* Pinned, non-revocable mappings of VRAM: will bad things happen? @ 2026-04-15 23:27 Demi Marie Obenour 2026-04-16 9:57 ` Christian König 0 siblings, 1 reply; 15+ messages in thread From: Demi Marie Obenour @ 2026-04-15 23:27 UTC (permalink / raw) To: dri-devel, Xen developer discussion, linux-media Cc: Val Packett, Christian König, Suwit Semal [-- Attachment #1.1.1: Type: text/plain, Size: 862 bytes --] Is it safe to assume that if a dmabuf exporter cannot handle non-revocable, pinned importers, it will fail the import? Or is using dma_buf_pin() unsafe if one does not know the exporter? For context, Xen grant tables do not support revocation. One can ask the guest to unmap the grants, but if the guest doesn't obey the only recourse is to ungracefully kill it. They also do not support page faults, so the pages must be pinned. Right now, grant tables don't support PCI BAR mappings, but that's fixable. How badly is this going to break with dGPU VRAM, if at all? I know that AMDGPU has a fallback when the BAR isn't mappable. What about other drivers? Supporting page faults the way KVM does is going to be extremely hard, so pinned mappings and DMA transfers are vastly preferable. -- Sincerely, Demi Marie Obenour (she/her/hers) [-- Attachment #1.1.2: OpenPGP public key --] [-- Type: application/pgp-keys, Size: 7253 bytes --] [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
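For readers not familiar with the interface being asked about, the importer-side flow in question looks roughly as follows. This is a minimal sketch of a hypothetical Xen grant-table importer (all xen_* names are placeholders); only the dma-buf/dma-resv calls are the existing kernel API, and the exporter is free to fail the pin.

#include <linux/dma-buf.h>
#include <linux/dma-resv.h>
#include <linux/err.h>

/* Placeholder: pinned buffers never move, so there is nothing to do here. */
static void xen_gntref_move_notify(struct dma_buf_attachment *attach)
{
}

static const struct dma_buf_attach_ops xen_gntref_attach_ops = {
	.move_notify = xen_gntref_move_notify,
};

/*
 * Hypothetical helper: attach to a dma-buf, pin it so the backing store
 * cannot move, and map it to get an sg_table describing the backing pages
 * or BAR space that would then be granted to the other domain.
 */
static struct sg_table *xen_gntref_pin_dmabuf(struct dma_buf *dmabuf,
					      struct device *dev,
					      struct dma_buf_attachment **out)
{
	struct dma_buf_attachment *attach;
	struct sg_table *sgt;
	int ret;

	attach = dma_buf_dynamic_attach(dmabuf, dev, &xen_gntref_attach_ops, NULL);
	if (IS_ERR(attach))
		return ERR_CAST(attach);

	dma_resv_lock(dmabuf->resv, NULL);
	ret = dma_buf_pin(attach);	/* the exporter may refuse, e.g. for VRAM */
	if (ret) {
		sgt = ERR_PTR(ret);
		goto unlock;
	}

	sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
	if (IS_ERR(sgt))
		dma_buf_unpin(attach);
unlock:
	dma_resv_unlock(dmabuf->resv);
	if (IS_ERR(sgt))
		dma_buf_detach(dmabuf, attach);
	else
		*out = attach;
	return sgt;
}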
* Re: Pinned, non-revocable mappings of VRAM: will bad things happen? 2026-04-15 23:27 Pinned, non-revocable mappings of VRAM: will bad things happen? Demi Marie Obenour @ 2026-04-16 9:57 ` Christian König 2026-04-16 16:13 ` Demi Marie Obenour 0 siblings, 1 reply; 15+ messages in thread From: Christian König @ 2026-04-16 9:57 UTC (permalink / raw) To: Demi Marie Obenour, dri-devel, Xen developer discussion, linux-media Cc: Val Packett, Suwit Semal On 4/16/26 01:27, Demi Marie Obenour wrote: > Is it safe to assume that if a dmabuf exporter cannot handle > non-revocable, pinned importers, it will fail the import? Or is > using dma_buf_pin() unsafe if one does not know the exporter? Neither. dma_buf_pin() makes sure that the importer doesn't get any invalidation notifications caused by the exporter moving the backing store of the buffer around for memory management. But what is still possible is that the exporter is hot removed, in which case the importer should basically terminate its DMA operation as soon as possible. GPU drivers usually reject pin requests to VRAM from DMA-buf importers when that isn't restricted by cgroups, for example, because it can otherwise easily result in a denial of service. Amdgpu only recently started to allow pinning into VRAM to support RDMA without ODP (I think it was ODP, but could be that I mixed up the RDMA three letter code for that feature). > For context, Xen grant tables do not support revocation. One can ask > the guest to unmap the grants, but if the guest doesn't obey the only > recourse is to ungracefully kill it. They also do not support page > faults, so the pages must be pinned. Right now, grant tables don't > support PCI BAR mappings, but that's fixable. That sounds like a use case for the DMA-buf pin interface. > How badly is this going to break with dGPU VRAM, if at all? I know > that AMDGPU has a fallback when the BAR isn't mappable. What about > other drivers? Supporting page faults the way KVM does is going to > be extremely hard, so pinned mappings and DMA transfers are vastly > preferable. Well, if you only want to share a fixed amount of VRAM then that is pretty much ok. But when the client VM can trigger pinning on demand without any limitation, you can pretty easily have a denial of service against the host. That is usually a rather bad idea. Regards, Christian. ^ permalink raw reply [flat|nested] 15+ messages in thread
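To illustrate the exporter side described above (rejecting pin requests that could only be satisfied from VRAM), here is a rough sketch loosely modelled on what GPU drivers do; every my_* name and domain flag is invented, only the dma_buf_ops .pin/.unpin hooks are the real interface.

#include <linux/dma-buf.h>

/* Sketch of an exporter-side pin callback, called with the dma_resv lock held. */
static int my_gem_dmabuf_pin(struct dma_buf_attachment *attach)
{
	struct my_bo *bo = my_bo_from_dmabuf(attach->dmabuf);

	/*
	 * Without something like a cgroup limit, letting arbitrary importers
	 * pin buffers in VRAM would allow a single client to exhaust it, so
	 * only allow pinning when the buffer can live in system memory (GTT).
	 */
	if ((bo->allowed_domains & MY_DOMAIN_VRAM) &&
	    !(bo->allowed_domains & MY_DOMAIN_GTT))
		return -EINVAL;

	return my_bo_pin(bo, MY_DOMAIN_GTT);
}

static void my_gem_dmabuf_unpin(struct dma_buf_attachment *attach)
{
	my_bo_unpin(my_bo_from_dmabuf(attach->dmabuf));
}

static const struct dma_buf_ops my_gem_dmabuf_ops = {
	.pin	= my_gem_dmabuf_pin,
	.unpin	= my_gem_dmabuf_unpin,
	/* ... attach/detach, map/unmap, mmap, release, etc. omitted ... */
};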
* Re: Pinned, non-revocable mappings of VRAM: will bad things happen? 2026-04-16 9:57 ` Christian König @ 2026-04-16 16:13 ` Demi Marie Obenour 2026-04-17 7:53 ` Christian König 0 siblings, 1 reply; 15+ messages in thread From: Demi Marie Obenour @ 2026-04-16 16:13 UTC (permalink / raw) To: Christian König, dri-devel, Xen developer discussion, linux-media Cc: Val Packett, Suwit Semal [-- Attachment #1.1.1: Type: text/plain, Size: 2361 bytes --] On 4/16/26 05:57, Christian König wrote: > On 4/16/26 01:27, Demi Marie Obenour wrote: >> Is it safe to assume that if a dmabuf exporter cannot handle >> non-revocable, pinned importers, it will fail the import? Or is >> using dma_buf_pin() unsafe if one does not know the exporter? > > Neither. > > dma_buf_pin() makes sure that the importer doesn't get any invalidation notifications because the exporter moves the backing store of the buffer around for memory management. > > But what is still possible is that the exporter is hot removed, in which case the importer should basically terminate it's DMA operation as soon as possible. > > GPU drivers usually reject pin requests to VRAM from DMA-buf importers when that isn't restricted by cgroups for example, because that can otherwise easily result in a deny of service. > > Amdgpu only recently started to allow pinning into VRAM to support RDMA without ODP (I think it was ODP, but could be that I mixed up the RDMA three letter code for that feature). > >> For context, Xen grant tables do not support revocation. One can ask >> the guest to unmap the grants, but if the guest doesn't obey the only >> recourse is to ungracefully kill it. They also do not support page >> faults, so the pages must be pinned. Right now, grant tables don't >> support PCI BAR mappings, but that's fixable. > > That sounds like an use case for the DMA-buf pin interface. > >> How badly is this going to break with dGPU VRAM, if at all? I know >> that AMDGPU has a fallback when the BAR isn't mappable. What about >> other drivers? Supporting page faults the way KVM does is going to >> be extremely hard, so pinned mappings and DMA transfers are vastly >> preferable. > > Well if you only want to share a fixed amount of VRAM then that is pretty much ok. > > But when the client VM can trigger pinning on demand without any limitation you can pretty easily have deny of service against the host. That is usually a rather bad idea. Is there a reasonable way to choose such an amount? Unless I am mistaken, client workloads are highly non-uniform: a single game or compute job might well use more VRAM than every other program on the system combined. Are these workloads impossible to make work well with pinning? -- Sincerely, Demi Marie Obenour (she/her/hers) [-- Attachment #1.1.2: OpenPGP public key --] [-- Type: application/pgp-keys, Size: 7253 bytes --] [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Pinned, non-revocable mappings of VRAM: will bad things happen? 2026-04-16 16:13 ` Demi Marie Obenour @ 2026-04-17 7:53 ` Christian König 2026-04-17 19:35 ` Demi Marie Obenour 0 siblings, 1 reply; 15+ messages in thread From: Christian König @ 2026-04-17 7:53 UTC (permalink / raw) To: Demi Marie Obenour, dri-devel, Xen developer discussion, linux-media Cc: Val Packett, Suwit Semal On 4/16/26 18:13, Demi Marie Obenour wrote: > On 4/16/26 05:57, Christian König wrote: >> On 4/16/26 01:27, Demi Marie Obenour wrote: >>> Is it safe to assume that if a dmabuf exporter cannot handle >>> non-revocable, pinned importers, it will fail the import? Or is >>> using dma_buf_pin() unsafe if one does not know the exporter? >> >> Neither. >> >> dma_buf_pin() makes sure that the importer doesn't get any invalidation notifications because the exporter moves the backing store of the buffer around for memory management. >> >> But what is still possible is that the exporter is hot removed, in which case the importer should basically terminate it's DMA operation as soon as possible. >> >> GPU drivers usually reject pin requests to VRAM from DMA-buf importers when that isn't restricted by cgroups for example, because that can otherwise easily result in a deny of service. >> >> Amdgpu only recently started to allow pinning into VRAM to support RDMA without ODP (I think it was ODP, but could be that I mixed up the RDMA three letter code for that feature). >> >>> For context, Xen grant tables do not support revocation. One can ask >>> the guest to unmap the grants, but if the guest doesn't obey the only >>> recourse is to ungracefully kill it. They also do not support page >>> faults, so the pages must be pinned. Right now, grant tables don't >>> support PCI BAR mappings, but that's fixable. >> >> That sounds like an use case for the DMA-buf pin interface. >> >>> How badly is this going to break with dGPU VRAM, if at all? I know >>> that AMDGPU has a fallback when the BAR isn't mappable. What about >>> other drivers? Supporting page faults the way KVM does is going to >>> be extremely hard, so pinned mappings and DMA transfers are vastly >>> preferable. >> >> Well if you only want to share a fixed amount of VRAM then that is pretty much ok. >> >> But when the client VM can trigger pinning on demand without any limitation you can pretty easily have deny of service against the host. That is usually a rather bad idea. > > Is there a reasonable way to choose such an amount? Not really. > Unless I am > mistaken, client workloads are highly non-uniform: a single game or > compute job might well use more VRAM than every other program on the > system combined. Yeah, perfectly correct. > Are these workloads impossible to make work well with pinning? No, as long as you don't know the workload beforehand, e.g. when you define the limit. I mean that's why basically everybody avoids pinning and assigning fixed amounts of resources. Even if you can make it work technically pinning usually results in a rather bad end user experience. Regards, Christian. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Pinned, non-revocable mappings of VRAM: will bad things happen? 2026-04-17 7:53 ` Christian König @ 2026-04-17 19:35 ` Demi Marie Obenour 2026-04-20 8:49 ` Christian König 0 siblings, 1 reply; 15+ messages in thread From: Demi Marie Obenour @ 2026-04-17 19:35 UTC (permalink / raw) To: Christian König, dri-devel, Xen developer discussion, linux-media Cc: Val Packett, Suwit Semal [-- Attachment #1.1.1: Type: text/plain, Size: 3815 bytes --] On 4/17/26 03:53, Christian König wrote: > On 4/16/26 18:13, Demi Marie Obenour wrote: >> On 4/16/26 05:57, Christian König wrote: >>> On 4/16/26 01:27, Demi Marie Obenour wrote: >>>> Is it safe to assume that if a dmabuf exporter cannot handle >>>> non-revocable, pinned importers, it will fail the import? Or is >>>> using dma_buf_pin() unsafe if one does not know the exporter? >>> >>> Neither. >>> >>> dma_buf_pin() makes sure that the importer doesn't get any invalidation notifications because the exporter moves the backing store of the buffer around for memory management. >>> >>> But what is still possible is that the exporter is hot removed, in which case the importer should basically terminate it's DMA operation as soon as possible. >>> >>> GPU drivers usually reject pin requests to VRAM from DMA-buf importers when that isn't restricted by cgroups for example, because that can otherwise easily result in a deny of service. >>> >>> Amdgpu only recently started to allow pinning into VRAM to support RDMA without ODP (I think it was ODP, but could be that I mixed up the RDMA three letter code for that feature). >>> >>>> For context, Xen grant tables do not support revocation. One can ask >>>> the guest to unmap the grants, but if the guest doesn't obey the only >>>> recourse is to ungracefully kill it. They also do not support page >>>> faults, so the pages must be pinned. Right now, grant tables don't >>>> support PCI BAR mappings, but that's fixable. >>> >>> That sounds like an use case for the DMA-buf pin interface. >>> >>>> How badly is this going to break with dGPU VRAM, if at all? I know >>>> that AMDGPU has a fallback when the BAR isn't mappable. What about >>>> other drivers? Supporting page faults the way KVM does is going to >>>> be extremely hard, so pinned mappings and DMA transfers are vastly >>>> preferable. >>> >>> Well if you only want to share a fixed amount of VRAM then that is pretty much ok. >>> >>> But when the client VM can trigger pinning on demand without any limitation you can pretty easily have deny of service against the host. That is usually a rather bad idea. >> >> Is there a reasonable way to choose such an amount? > > Not really. > >> Unless I am >> mistaken, client workloads are highly non-uniform: a single game or >> compute job might well use more VRAM than every other program on the >> system combined. > > Yeah, perfectly correct. > >> Are these workloads impossible to make work well with pinning? > > No, as long as you don't know the workload beforehand, e.g. when you define the limit. > > I mean that's why basically everybody avoids pinning and assigning fixed amounts of resources. > > Even if you can make it work technically pinning usually results in a rather bad end user experience. > > Regards, > Christian. Do drivers and programs assume that they can access VRAM from the CPU? Are any of the following reasonable options? 1. Change the guest kernel to only map (and thus pin) a small subset of VRAM at any given time. 
If unmapped VRAM is accessed, the guest traps the page fault, evicts an old VRAM mapping, and creates a new one. 2. Pretend that resizable BAR is not enabled, so the guest doesn't think it can map much of VRAM at once. If resizable BAR is enabled on the host, it might be possible to split the large BAR mapping in a lot of ways. Or does Xen really need to allow the host to handle guest page faults? That adds a huge amount of complexity to trusted and security-critical parts of the system, so it really is a last resort. Putting the complexity into the guest virtio-GPU driver is vastly preferable if it can be made to work well. -- Sincerely, Demi Marie Obenour (she/her/hers) [-- Attachment #1.1.2: OpenPGP public key --] [-- Type: application/pgp-keys, Size: 7253 bytes --] [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
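To make option 1 concrete, a very rough sketch of what the guest-side fault handling could look like: keep an LRU of small granted/pinned windows into VRAM and service CPU faults by recycling the oldest window. Every virtgpu_window_* helper is hypothetical; only the vm_fault plumbing is the existing kernel API.

#include <linux/mm.h>

/* Hypothetical guest-side fault handler for option 1. */
static vm_fault_t virtgpu_vram_window_fault(struct vm_fault *vmf)
{
	struct virtgpu_object *bo = vmf->vma->vm_private_data;
	struct virtgpu_vram_window *win;

	/* Evict the least recently used window, ungranting/unpinning it. */
	win = virtgpu_window_evict_lru(bo->vgdev);
	if (!win)
		return VM_FAULT_OOM;

	/*
	 * Ask the host to expose this part of the buffer, then grant/pin the
	 * corresponding BAR pages.  Only a bounded number of windows is
	 * pinned at any time, so the non-revocable VRAM per guest stays small.
	 */
	if (virtgpu_window_map(win, bo, vmf->pgoff))
		return VM_FAULT_SIGBUS;

	return vmf_insert_pfn(vmf->vma, vmf->address,
			      virtgpu_window_pfn(win, vmf->pgoff));
}

static const struct vm_operations_struct virtgpu_vram_window_vm_ops = {
	.fault = virtgpu_vram_window_fault,
};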
* Re: Pinned, non-revocable mappings of VRAM: will bad things happen? 2026-04-17 19:35 ` Demi Marie Obenour @ 2026-04-20 8:49 ` Christian König 2026-04-20 17:03 ` Demi Marie Obenour 0 siblings, 1 reply; 15+ messages in thread From: Christian König @ 2026-04-20 8:49 UTC (permalink / raw) To: Demi Marie Obenour, dri-devel, Xen developer discussion, linux-media Cc: Val Packett, Suwit Semal On 4/17/26 21:35, Demi Marie Obenour wrote: > On 4/17/26 03:53, Christian König wrote: >> On 4/16/26 18:13, Demi Marie Obenour wrote: >>> On 4/16/26 05:57, Christian König wrote: > ... >>> Unless I am >>> mistaken, client workloads are highly non-uniform: a single game or >>> compute job might well use more VRAM than every other program on the >>> system combined. >> >> Yeah, perfectly correct. >> >>> Are these workloads impossible to make work well with pinning? >> >> No, as long as you don't know the workload beforehand, e.g. when you define the limit. >> >> I mean that's why basically everybody avoids pinning and assigning fixed amounts of resources. >> >> Even if you can make it work technically pinning usually results in a rather bad end user experience. >> >> Regards, >> Christian. > > Do drivers and programs assume that they can access VRAM from the CPU? Yes, and that is actually really important for performance. That's why Alex and I came up with the idea of using the resizable BAR feature to access all of VRAM on modern GPUs. There are a couple of hacks which have been implemented over the years for exotic platforms where MMIO/VRAM access was problematic. For example, on a page fault you use a GPU DMA engine to copy the VRAM buffer into system memory, make the CPU memory access, and then copy it back again on demand at the next command submission. But all of those hacks are basically just proofs of concept and result in completely unusable performance. > Are any of the following reasonable options? > > 1. Change the guest kernel to only map (and thus pin) a small subset > of VRAM at any given time. If unmapped VRAM is accessed the guest > traps the page fault, evicts an old VRAM mapping, and creates a > new one. Yeah, that could potentially work. This is basically what we do in the host kernel driver when we can't resize the BAR for some reason. In that use case VRAM buffers are shuffled in and out of the CPU-accessible window of VRAM on demand. > 2. Pretend that resizable BAR is not enabled, so the guest doesn't > think it can map much of VRAM at once. If resizable BAR is enabled > on the host, it might be possible to split the large BAR mapping > in a lot of ways. That won't work. The userspace parts of the driver stack don't care how large the BAR to access VRAM with the CPU is. The expectation is that the kernel driver makes things CPU accessible as needed in the page fault handler. It is still a good idea for your solution #1 to give the amount of "pin-able" VRAM to the userspace stack as the CPU-visible VRAM limit so that test cases and applications try to lower their usage of VRAM, e.g. use system memory bounce buffers when possible. > Or does Xen really need to allow the host to handle guest page faults? > That adds a huge amount of complexity to trusted and security-critical > parts of the system, so it really is a last resort. Putting the > complexity in to the guest virtio-GPU driver is vastly preferable if > it can be made to work well. Well, the nested page fault handling KVM offers has proven to be extremely useful. So when XEN can't do this, it is clearly lacking an important feature.
But I have one question: When XEN has a problem handling faults from the guest on the host then how does that work for system memory mappings? There is really no difference between VRAM and system memory in the handling for the GPU driver stack. Regards, Christian. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Pinned, non-revocable mappings of VRAM: will bad things happen? 2026-04-20 8:49 ` Christian König @ 2026-04-20 17:03 ` Demi Marie Obenour 2026-04-20 17:58 ` Christian König 0 siblings, 1 reply; 15+ messages in thread From: Demi Marie Obenour @ 2026-04-20 17:03 UTC (permalink / raw) To: Christian König, dri-devel, Xen developer discussion, linux-media Cc: Val Packett, Suwit Semal [-- Attachment #1.1.1: Type: text/plain, Size: 5018 bytes --] On 4/20/26 04:49, Christian König wrote: > On 4/17/26 21:35, Demi Marie Obenour wrote: >> On 4/17/26 03:53, Christian König wrote: >>> On 4/16/26 18:13, Demi Marie Obenour wrote: >>>> On 4/16/26 05:57, Christian König wrote: > ... >>>> Unless I am >>>> mistaken, client workloads are highly non-uniform: a single game or >>>> compute job might well use more VRAM than every other program on the >>>> system combined. >>> >>> Yeah, perfectly correct. >>> >>>> Are these workloads impossible to make work well with pinning? >>> >>> No, as long as you don't know the workload beforehand, e.g. when you define the limit. >>> >>> I mean that's why basically everybody avoids pinning and assigning fixed amounts of resources. >>> >>> Even if you can make it work technically pinning usually results in a rather bad end user experience. >>> >>> Regards, >>> Christian. >> >> Do drivers and programs assume that they can access VRAM from the CPU? > > Yes, and that is actually really important for performance. > > That's why Alex and I came up with the idea of using the resize able BAR feature to access all of VRAM on modern GPUs. > > There are a couple of hacks which have been implemented over the years for exotic platforms were MMIO/VRAM access was problematic. For example on a page fault you use a GPU DMA engine to copy the VRAM buffer into system memory, make the CPU memory access and then copy it back again on demand at the next command submission. > > But all of those hacks are basically just prove of concepts and result in completely unusable performance. I'm reminded of Asahi considering emulating unaligned accesses to PCI BARs in a fault handler. >> Are any of the following reasonable options? >> >> 1. Change the guest kernel to only map (and thus pin) a small subset >> of VRAM at any given time. If unmapped VRAM is accessed the guest >> traps the page fault, evicts an old VRAM mapping, and creates a >> new one. > > Yeah, that could potentially work. > > This is basically what we do on the host kernel driver when we can't resize the BAR for some reason. In that use case VRAM buffers are shuffled in and out of the CPU accessible window of VRAM on demand. How much is this going to hurt performance? >> 2. Pretend that resizable BAR is not enabled, so the guest doesn't >> think it can map much of VRAM at once. If resizable BAR is enabled >> on the host, it might be possible to split the large BAR mapping >> in a lot of ways. > > That won't work. The userspace parts of the driver stack don't care how large the BAR to access VRAM with the CPU is. > > The expectation is that the kernel driver makes thing CPU accessible as needed in the page fault handler. > > It is still a good idea for your solution #1 to give the amount of "pin-able" VRAM to the userspace stack as CPU visible VRAM limit so that test cases and applications try to lower their usage of VRAM, e.g. use system memory bounce buffers when possible. That makes sense. >> Or does Xen really need to allow the host to handle guest page faults? 
>> That adds a huge amount of complexity to trusted and security-critical >> parts of the system, so it really is a last resort. Putting the >> complexity in to the guest virtio-GPU driver is vastly preferable if >> it can be made to work well. > > Well the nested page fault handling KVM offers has proven to be extremely useful. So when XEN can't do this it is clearly lacking an important feature. I agree. However, it is a lot of work to implement, which is why I'm looking for alternatives if possible. KVM is part of the Linux kernel, so it can just call the Linux kernel functions used to handle userspace page faults. Xen is separate from Linux, so it can't do that. Instead, it will need to: 1. Determine that the fault needs to be handled by another VM, and the ID of the VM that needs to handle the fault. 2. Send a message to the VM asking it to handle the fault. 3. Block the vCPU until it gets a response. Then the VM owning the memory will need to call the page fault handler and provide the memory to Xen. Xen then needs to: 4. Map the memory into the nested page tables of the VM that faulted. 5. Resume the vCPU. > But I have one question: When XEN has a problem handling faults from the guest on the host then how does that work for system memory mappings? > > There is really no difference between VRAM and system memory in the handling for the GPU driver stack. > > Regards, > Christian. Generally, Xen makes the frontend (usually an unprivileged VM) responsible for providing mappings to the backend (usually the host). That is possible with system RAM but not with VRAM, because Xen has no awareness of VRAM. To Xen, VRAM is just a PCI BAR. KVM runs in the same kernel as the GPU driver. Xen doesn't, and that is the source of the extra complexity. -- Sincerely, Demi Marie Obenour (she/her/hers) [-- Attachment #1.1.2: OpenPGP public key --] [-- Type: application/pgp-keys, Size: 7253 bytes --] [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
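Purely as an illustration of the shape of steps 2-4 above, here is a hypothetical message pair that Xen and the backend VM could exchange; nothing like this exists in Xen today and all names are invented.

#include <stdint.h>

/* Hypothetical wire format for forwarding a second-stage (nested page table)
 * fault from Xen to the VM that owns the memory.  Invented for illustration. */
struct gpu_fault_request {
	uint32_t domid;		/* faulting guest */
	uint32_t vcpu;		/* vCPU that stays blocked until the reply arrives */
	uint64_t gfn;		/* guest frame number that faulted */
	uint32_t access;	/* read/write/exec bits of the faulting access */
};

struct gpu_fault_reply {
	uint64_t mfn;		/* machine frame Xen should map at gfn, if err == 0 */
	int32_t  err;		/* 0 on success, negative error otherwise */
};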
* Re: Pinned, non-revocable mappings of VRAM: will bad things happen? 2026-04-20 17:03 ` Demi Marie Obenour @ 2026-04-20 17:58 ` Christian König 2026-04-20 18:46 ` Demi Marie Obenour 0 siblings, 1 reply; 15+ messages in thread From: Christian König @ 2026-04-20 17:58 UTC (permalink / raw) To: Demi Marie Obenour, dri-devel, Xen developer discussion, linux-media Cc: Val Packett, Suwit Semal On 4/20/26 19:03, Demi Marie Obenour wrote: > On 4/20/26 04:49, Christian König wrote: >> On 4/17/26 21:35, Demi Marie Obenour wrote: ... >>> Are any of the following reasonable options? >>> >>> 1. Change the guest kernel to only map (and thus pin) a small subset >>> of VRAM at any given time. If unmapped VRAM is accessed the guest >>> traps the page fault, evicts an old VRAM mapping, and creates a >>> new one. >> >> Yeah, that could potentially work. >> >> This is basically what we do on the host kernel driver when we can't resize the BAR for some reason. In that use case VRAM buffers are shuffled in and out of the CPU accessible window of VRAM on demand. > > How much is this going to hurt performance? Hard to say, resizing the BAR can easily give you 10-15% more performance on some use cases. But that involves physically transferring the data using a DMA. For this solution we basically only have to transfer a few messages between host and guest. No idea how performant that is. >>> 2. Pretend that resizable BAR is not enabled, so the guest doesn't >>> think it can map much of VRAM at once. If resizable BAR is enabled >>> on the host, it might be possible to split the large BAR mapping >>> in a lot of ways. >> >> That won't work. The userspace parts of the driver stack don't care how large the BAR to access VRAM with the CPU is. >> >> The expectation is that the kernel driver makes thing CPU accessible as needed in the page fault handler. >> >> It is still a good idea for your solution #1 to give the amount of "pin-able" VRAM to the userspace stack as CPU visible VRAM limit so that test cases and applications try to lower their usage of VRAM, e.g. use system memory bounce buffers when possible. > > That makes sense. > >>> Or does Xen really need to allow the host to handle guest page faults? >>> That adds a huge amount of complexity to trusted and security-critical >>> parts of the system, so it really is a last resort. Putting the >>> complexity in to the guest virtio-GPU driver is vastly preferable if >>> it can be made to work well. >> >> Well the nested page fault handling KVM offers has proven to be extremely useful. So when XEN can't do this it is clearly lacking an important feature. > > I agree. However, it is a lot of work to implement, which is why I'm > looking for alternatives if possible. > > KVM is part of the Linux kernel, so it can just call the Linux kernel > functions used to handle userspace page faults. Xen is separate from > Linux, so it can't do that. Instead, it will need to: > > 1. Determine that the fault needs to be handled by another VM, and > the ID of the VM that needs to handle the fault. > 2. Send a message to the VM asking it to handle the fault. > 3. Block the vCPU until it gets a response. > > Then the VM owning the memory will need to call the page fault handler > and provide the memory to Xen. Xen then needs to: > > 4. Map the memory into the nested page tables of the VM that faulted. > 5. Resume the vCPU.
> >> But I have one question: When XEN has a problem handling faults from the guest on the host then how does that work for system memory mappings? >> >> There is really no difference between VRAM and system memory in the handling for the GPU driver stack. >> >> Regards, >> Christian. > > Generally, Xen makes the frontend (usually an unprivileged VM) > responsible for providing mappings to the backend (usually the host). > That is possible with system RAM but not with VRAM, because Xen has > no awareness of VRAM. To Xen, VRAM is just a PCI BAR. No, that doesn't work with system memory allocations of GPU drivers either. We already had it multiple times that people tried to be clever and incremented the page reference counter on driver-allocated system memory, and were totally surprised that this can result in security issues and data corruption. I seriously hope that this isn't the case here again. As far as I know, XEN already has support for accessing VMAs marked VM_PFNMAP; otherwise I don't know how access to driver-allocated system memory could work at all. Accessing VRAM is pretty much the same use case as far as I can see. Regards, Christian. > KVM runs in the same kernel as the GPU driver. Xen doesn't, and that > is the source of the extra complexity. ^ permalink raw reply [flat|nested] 15+ messages in thread
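For context on the VM_PFNMAP point: GPU drivers map their buffers to userspace as pure PFN mappings, so there is no struct page whose reference count an outside component could legitimately take. A simplified sketch of the relevant mmap setup follows; the my_* names are placeholders, the flags and the GEM mmap hook are the real ones, and vm_flags_set() assumes a recent kernel (older code sets vma->vm_flags directly).

#include <linux/mm.h>
#include <drm/drm_gem.h>

/* Simplified from what TTM/GEM-based drivers do in their mmap paths. */
static int my_gem_object_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma)
{
	/*
	 * VM_PFNMAP: the memory behind this mapping is either VRAM BAR space
	 * with no struct page at all, or driver-managed system memory whose
	 * pages must not be refcounted by outside users, so importers cannot
	 * simply pin it by taking page references.
	 */
	vm_flags_set(vma, VM_PFNMAP | VM_IO | VM_DONTEXPAND | VM_DONTDUMP);
	vma->vm_ops = &my_gem_vm_ops;	/* faults resolved via vmf_insert_pfn() */
	return 0;
}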
* Re: Pinned, non-revocable mappings of VRAM: will bad things happen? 2026-04-20 17:58 ` Christian König @ 2026-04-20 18:46 ` Demi Marie Obenour 2026-04-20 18:53 ` Christian König 0 siblings, 1 reply; 15+ messages in thread From: Demi Marie Obenour @ 2026-04-20 18:46 UTC (permalink / raw) To: Christian König, dri-devel, Xen developer discussion, linux-media Cc: Val Packett, Suwit Semal [-- Attachment #1.1.1: Type: text/plain, Size: 5497 bytes --] On 4/20/26 13:58, Christian König wrote: > On 4/20/26 19:03, Demi Marie Obenour wrote: >> On 4/20/26 04:49, Christian König wrote: >>> On 4/17/26 21:35, Demi Marie Obenour wrote: > ... >>>> Are any of the following reasonable options? >>>> >>>> 1. Change the guest kernel to only map (and thus pin) a small subset >>>> of VRAM at any given time. If unmapped VRAM is accessed the guest >>>> traps the page fault, evicts an old VRAM mapping, and creates a >>>> new one. >>> >>> Yeah, that could potentially work. >>> >>> This is basically what we do on the host kernel driver when we can't resize the BAR for some reason. In that use case VRAM buffers are shuffled in and out of the CPU accessible window of VRAM on demand. >> >> How much is this going to hurt performance? > > Hard to say, resizing the BAR can easily give you 10-15% more performance on some use cases. > > But that involves physically transferring the data using a DMA. For this solution we basically only have to we basically only have to transfer a few messages between host and guest. > > No idea how performant that is. In this use-case, 20-30% performance penalties are likely to be "business as usual". Close to native performance would be ideal, but to be useful it just needs to beat software rendering by a wide margin, and not cause data corruption or vulnerabilities. >>>> 2. Pretend that resizable BAR is not enabled, so the guest doesn't >>>> think it can map much of VRAM at once. If resizable BAR is enabled >>>> on the host, it might be possible to split the large BAR mapping >>>> in a lot of ways. >>> >>> That won't work. The userspace parts of the driver stack don't care how large the BAR to access VRAM with the CPU is. >>> >>> The expectation is that the kernel driver makes thing CPU accessible as needed in the page fault handler. >>> >>> It is still a good idea for your solution #1 to give the amount of "pin-able" VRAM to the userspace stack as CPU visible VRAM limit so that test cases and applications try to lower their usage of VRAM, e.g. use system memory bounce buffers when possible. >> >> That makes sense. >> >>>> Or does Xen really need to allow the host to handle guest page faults? >>>> That adds a huge amount of complexity to trusted and security-critical >>>> parts of the system, so it really is a last resort. Putting the >>>> complexity in to the guest virtio-GPU driver is vastly preferable if >>>> it can be made to work well. >>> >>> Well the nested page fault handling KVM offers has proven to be extremely useful. So when XEN can't do this it is clearly lacking an important feature. >> >> I agree. However, it is a lot of work to implement, which is why I'm >> looking for alternatives if possible. >> >> KVM is part of the Linux kernel, so it can just call the Linux kernel >> functions used to handle userspace page faults. Xen is separate from >> Linux, so it can't do that. Instead, it will need to: >> >> 1. Determine that the fault needs to be handled by another VM, and >> the ID of the VM that needs to handle the fault. >> 2. 
Send a message to the VM asking it to handle the fault. >> 3. Block the vCPU until it gets a response. >> >> Then the VM owning the memory will need to call the page fault handler >> and provide the memory to Xen. Xen then needs to: >> >> 4. Map the memory into the nested page tables of the VM that faulted. >> 5. Resume the vCPU. >> >>> But I have one question: When XEN has a problem handling faults from the guest on the host then how does that work for system memory mappings? >>> >>> There is really no difference between VRAM and system memory in the handling for the GPU driver stack. >>> >>> Regards, >>> Christian. >> >> Generally, Xen makes the frontend (usually an unprivileged VM) >> responsible for providing mappings to the backend (usually the host). >> That is possible with system RAM but not with VRAM, because Xen has >> no awareness of VRAM. To Xen, VRAM is just a PCI BAR. > > No, that doesn't work with system memory allocations of GPU drivers either. > > We already had it multiple times that people tried to be clever and incremented the page reference counter on driver allocated system memory and were totally surprised that this can result in security issues and data corruption. > > I seriously hope that this isn't the case here again. As far as I know XEN already has support for accessing VMAs with VM_PFN or otherwise I don't know how driver allocated system memory access could potentially work. > > Accessing VRAM is pretty much the same use case as far as I can see. > > Regards, > Christian. The Xen-native approach would be for system memory allocations to be made using the Xen driver and then imported into the virtio-GPU driver via dmabuf. Is there any chance this could be made to happen? If it's a lost cause, then how much is the memory overhead of pinning everything ever used in a dmabuf? It should be possible to account pinned host memory against a guest's quota, but if that leads to an unusable system it isn't going to be good. Is supporting page faults in Xen the only solution that will be viable long-term, considering the tolerance for very substantial performance overheads compared to native? AAA gaming isn't the initial goal here. Qubes OS already supports PCI passthrough for that. -- Sincerely, Demi Marie Obenour (she/her/hers) [-- Attachment #1.1.2: OpenPGP public key --] [-- Type: application/pgp-keys, Size: 7253 bytes --] [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Pinned, non-revocable mappings of VRAM: will bad things happen? 2026-04-20 18:46 ` Demi Marie Obenour @ 2026-04-20 18:53 ` Christian König 2026-04-20 19:12 ` Demi Marie Obenour 0 siblings, 1 reply; 15+ messages in thread From: Christian König @ 2026-04-20 18:53 UTC (permalink / raw) To: Demi Marie Obenour, dri-devel, Xen developer discussion, linux-media Cc: Val Packett, Suwit Semal, Pelloux-Prayer, Pierre-Eric On 4/20/26 20:46, Demi Marie Obenour wrote: > On 4/20/26 13:58, Christian König wrote: >> On 4/20/26 19:03, Demi Marie Obenour wrote: >>> On 4/20/26 04:49, Christian König wrote: >>>> On 4/17/26 21:35, Demi Marie Obenour wrote: >> ... >>>>> Are any of the following reasonable options? >>>>> >>>>> 1. Change the guest kernel to only map (and thus pin) a small subset >>>>> of VRAM at any given time. If unmapped VRAM is accessed the guest >>>>> traps the page fault, evicts an old VRAM mapping, and creates a >>>>> new one. >>>> >>>> Yeah, that could potentially work. >>>> >>>> This is basically what we do on the host kernel driver when we can't resize the BAR for some reason. In that use case VRAM buffers are shuffled in and out of the CPU accessible window of VRAM on demand. >>> >>> How much is this going to hurt performance? >> >> Hard to say, resizing the BAR can easily give you 10-15% more performance on some use cases. >> >> But that involves physically transferring the data using a DMA. For this solution we basically only have to we basically only have to transfer a few messages between host and guest. >> >> No idea how performant that is. > > In this use-case, 20-30% performance penalties are likely to be > "business as usual". Well that is quite a bit. > Close to native performance would be ideal, but > to be useful it just needs to beat software rendering by a wide margin, > and not cause data corruption or vulnerabilities. That should still easily be the case, even trivial use cases are multiple magnitudes faster on GPUs compared to software rendering. >>> >>>> But I have one question: When XEN has a problem handling faults from the guest on the host then how does that work for system memory mappings? >>>> >>>> There is really no difference between VRAM and system memory in the handling for the GPU driver stack. >>>> >>>> Regards, >>>> Christian. >>> >>> Generally, Xen makes the frontend (usually an unprivileged VM) >>> responsible for providing mappings to the backend (usually the host). >>> That is possible with system RAM but not with VRAM, because Xen has >>> no awareness of VRAM. To Xen, VRAM is just a PCI BAR. >> >> No, that doesn't work with system memory allocations of GPU drivers either. >> >> We already had it multiple times that people tried to be clever and incremented the page reference counter on driver allocated system memory and were totally surprised that this can result in security issues and data corruption. >> >> I seriously hope that this isn't the case here again. As far as I know XEN already has support for accessing VMAs with VM_PFN or otherwise I don't know how driver allocated system memory access could potentially work. >> >> Accessing VRAM is pretty much the same use case as far as I can see. >> >> Regards, >> Christian. > > The Xen-native approach would be for system memory allocations to > be made using the Xen driver and then imported into the virtio-GPU > driver via dmabuf. Is there any chance this could be made to happen? That could be. Adding Pierre-Eric to comment since he knows that use much better than I do. 
> If it's a lost cause, then how much is the memory overhead of pinning > everything ever used in a dmabuf? It should be possible to account > pinned host memory against a guest's quota, but if that leads to an > unusable system it isn't going to be good. That won't work at all. We have use cases where you *must* migrate a DMA-buf to VRAM or otherwise the GPU can't use it. A simple scanout to a monitor is such a use case, for example; that is usually not possible from system memory. > Is supporting page faults in Xen the only solution that will be viable > long-term, considering the tolerance for very substantial performance > overheads compared to native? AAA gaming isn't the initial goal here. > Qubes OS already supports PCI passthrough for that. We have had AAA gaming working on XEN through native contexts for quite a while. Pierre-Eric can tell you more about that. Regards, Christian. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Pinned, non-revocable mappings of VRAM: will bad things happen? 2026-04-20 18:53 ` Christian König @ 2026-04-20 19:12 ` Demi Marie Obenour 2026-04-21 16:55 ` Val Packett 0 siblings, 1 reply; 15+ messages in thread From: Demi Marie Obenour @ 2026-04-20 19:12 UTC (permalink / raw) To: Christian König, dri-devel, Xen developer discussion, linux-media Cc: Val Packett, Suwit Semal, Pelloux-Prayer, Pierre-Eric [-- Attachment #1.1.1: Type: text/plain, Size: 4886 bytes --] On 4/20/26 14:53, Christian König wrote: > On 4/20/26 20:46, Demi Marie Obenour wrote: >> On 4/20/26 13:58, Christian König wrote: >>> On 4/20/26 19:03, Demi Marie Obenour wrote: >>>> On 4/20/26 04:49, Christian König wrote: >>>>> On 4/17/26 21:35, Demi Marie Obenour wrote: >>> ... >>>>>> Are any of the following reasonable options? >>>>>> >>>>>> 1. Change the guest kernel to only map (and thus pin) a small subset >>>>>> of VRAM at any given time. If unmapped VRAM is accessed the guest >>>>>> traps the page fault, evicts an old VRAM mapping, and creates a >>>>>> new one. >>>>> >>>>> Yeah, that could potentially work. >>>>> >>>>> This is basically what we do on the host kernel driver when we can't resize the BAR for some reason. In that use case VRAM buffers are shuffled in and out of the CPU accessible window of VRAM on demand. >>>> >>>> How much is this going to hurt performance? >>> >>> Hard to say, resizing the BAR can easily give you 10-15% more performance on some use cases. >>> >>> But that involves physically transferring the data using a DMA. For this solution we basically only have to we basically only have to transfer a few messages between host and guest. >>> >>> No idea how performant that is. >> >> In this use-case, 20-30% performance penalties are likely to be >> "business as usual". > > Well that is quite a bit. > >> Close to native performance would be ideal, but >> to be useful it just needs to beat software rendering by a wide margin, >> and not cause data corruption or vulnerabilities. > > That should still easily be the case, even trivial use cases are multiple magnitudes faster on GPUs compared to software rendering. Makes sense. If only GPUs supported easy and flexible virtualization the way CPUs do :(. >>>> >>>>> But I have one question: When XEN has a problem handling faults from the guest on the host then how does that work for system memory mappings? >>>>> >>>>> There is really no difference between VRAM and system memory in the handling for the GPU driver stack. >>>>> >>>>> Regards, >>>>> Christian. >>>> >>>> Generally, Xen makes the frontend (usually an unprivileged VM) >>>> responsible for providing mappings to the backend (usually the host). >>>> That is possible with system RAM but not with VRAM, because Xen has >>>> no awareness of VRAM. To Xen, VRAM is just a PCI BAR. >>> >>> No, that doesn't work with system memory allocations of GPU drivers either. >>> >>> We already had it multiple times that people tried to be clever and incremented the page reference counter on driver allocated system memory and were totally surprised that this can result in security issues and data corruption. >>> >>> I seriously hope that this isn't the case here again. As far as I know XEN already has support for accessing VMAs with VM_PFN or otherwise I don't know how driver allocated system memory access could potentially work. >>> >>> Accessing VRAM is pretty much the same use case as far as I can see. >>> >>> Regards, >>> Christian. 
>> >> The Xen-native approach would be for system memory allocations to >> be made using the Xen driver and then imported into the virtio-GPU >> driver via dmabuf. Is there any chance this could be made to happen? > > That could be. Adding Pierre-Eric to comment since he knows that use much better than I do. > >> If it's a lost cause, then how much is the memory overhead of pinning >> everything ever used in a dmabuf? It should be possible to account >> pinned host memory against a guest's quota, but if that leads to an >> unusable system it isn't going to be good. > > That won't work at all. > > We have use cases where you *must* migrate a DMA-buf to VRAM or otherwise the GPU can't use it. > > A simple scanout to a monitor is such an use case for example, that is usually not possible from system memory. Direct scanout isn't a concern here. >> Is supporting page faults in Xen the only solution that will be viable >> long-term, considering the tolerance for very substantial performance >> overheads compared to native? AAA gaming isn't the initial goal here. >> Qubes OS already supports PCI passthrough for that. > > We have AAA gaming working on XEN through native context working for quite a while. > > Pierre-Eric can tell you more about that. > > Regards, > Christian. I've heard of that, but last I checked it required downstream patches to Xen, Linux, and QEMU. I don't know if any of those have been upstreamed since, but I believe that upstreaming the Xen and Linux patches (or rewriting them and upstreaming the rewritten version) would be necessary. Qubes OS (which I don't work for anymore but still want to help with this) almost certainly won't be using QEMU for GPU stuff. -- Sincerely, Demi Marie Obenour (she/her/hers) [-- Attachment #1.1.2: OpenPGP public key --] [-- Type: application/pgp-keys, Size: 7253 bytes --] [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Pinned, non-revocable mappings of VRAM: will bad things happen? 2026-04-20 19:12 ` Demi Marie Obenour @ 2026-04-21 16:55 ` Val Packett 2026-04-21 17:43 ` Christian König 2026-04-22 1:27 ` Demi Marie Obenour 0 siblings, 2 replies; 15+ messages in thread From: Val Packett @ 2026-04-21 16:55 UTC (permalink / raw) To: Demi Marie Obenour, Christian König, dri-devel, Xen developer discussion, linux-media Cc: Suwit Semal, Pelloux-Prayer, Pierre-Eric On 4/20/26 4:12 PM, Demi Marie Obenour wrote: > On 4/20/26 14:53, Christian König wrote: >> On 4/20/26 20:46, Demi Marie Obenour wrote: >>> On 4/20/26 13:58, Christian König wrote: >>>> On 4/20/26 19:03, Demi Marie Obenour wrote: >>>>> On 4/20/26 04:49, Christian König wrote: >>>>>> On 4/17/26 21:35, Demi Marie Obenour wrote: >>>> ... >>>>>>> Are any of the following reasonable options? >>>>>>> >>>>>>> 1. Change the guest kernel to only map (and thus pin) a small subset >>>>>>> of VRAM at any given time. If unmapped VRAM is accessed the guest >>>>>>> traps the page fault, evicts an old VRAM mapping, and creates a >>>>>>> new one. >>>>>> Yeah, that could potentially work. >>>>>> >>>>>> This is basically what we do on the host kernel driver when we can't resize the BAR for some reason. In that use case VRAM buffers are shuffled in and out of the CPU accessible window of VRAM on demand. >>>>> How much is this going to hurt performance? >>>> Hard to say, resizing the BAR can easily give you 10-15% more performance on some use cases. >>>> >>>> But that involves physically transferring the data using a DMA. For this solution we basically only have to we basically only have to transfer a few messages between host and guest. >>>> >>>> No idea how performant that is. >>> In this use-case, 20-30% performance penalties are likely to be >>> "business as usual". >> Well that is quite a bit. >> >>> Close to native performance would be ideal, but >>> to be useful it just needs to beat software rendering by a wide margin, >>> and not cause data corruption or vulnerabilities. >> That should still easily be the case, even trivial use cases are multiple magnitudes faster on GPUs compared to software rendering. > Makes sense. If only GPUs supported easy and flexible virtualization the way CPUs do :(. > >>>>>> But I have one question: When XEN has a problem handling faults from the guest on the host then how does that work for system memory mappings? >>>>>> >>>>>> There is really no difference between VRAM and system memory in the handling for the GPU driver stack. >>>>>> >>>>>> Regards, >>>>>> Christian. >>>>> Generally, Xen makes the frontend (usually an unprivileged VM) >>>>> responsible for providing mappings to the backend (usually the host). >>>>> That is possible with system RAM but not with VRAM, because Xen has >>>>> no awareness of VRAM. To Xen, VRAM is just a PCI BAR. >>>> No, that doesn't work with system memory allocations of GPU drivers either. >>>> >>>> We already had it multiple times that people tried to be clever and incremented the page reference counter on driver allocated system memory and were totally surprised that this can result in security issues and data corruption. >>>> >>>> I seriously hope that this isn't the case here again. As far as I know XEN already has support for accessing VMAs with VM_PFN or otherwise I don't know how driver allocated system memory access could potentially work. >>>> >>>> Accessing VRAM is pretty much the same use case as far as I can see. >>>> >>>> Regards, >>>> Christian. 
>>> The Xen-native approach would be for system memory allocations to >>> be made using the Xen driver and then imported into the virtio-GPU >>> driver via dmabuf. Is there any chance this could be made to happen? >> That could be. Adding Pierre-Eric to comment since he knows that use much better than I do. >> >>> If it's a lost cause, then how much is the memory overhead of pinning >>> everything ever used in a dmabuf? It should be possible to account >>> pinned host memory against a guest's quota, but if that leads to an >>> unusable system it isn't going to be good. >> That won't work at all. >> >> We have use cases where you *must* migrate a DMA-buf to VRAM or otherwise the GPU can't use it. >> >> A simple scanout to a monitor is such an use case for example, that is usually not possible from system memory. > Direct scanout isn't a concern here. > >>> Is supporting page faults in Xen the only solution that will be viable >>> long-term, considering the tolerance for very substantial performance >>> overheads compared to native? AAA gaming isn't the initial goal here. >>> Qubes OS already supports PCI passthrough for that. >> We have AAA gaming working on XEN through native context working for quite a while. >> >> Pierre-Eric can tell you more about that. >> >> Regards, >> Christian. > I've heard of that, but last I checked it required downstream patches > to Xen, Linux, and QEMU. I don't know if any of those have been > upstreamed since, but I believe that upstreaming the Xen and Linux > patches (or rewriting them and upstreaming the rewritten version) would > be necessary. Qubes OS (which I don't work for anymore but still want > to help with this) almost certainly won't be using QEMU for GPU stuff. Yeah, our plan is to use xen-vhost-frontend[1] + vhost-device-gpu, ported/extended/modified as necessary. (I already have xen-vhost-frontend itself working on amd64 PVH with purely xenbus-based hotplug/configuration, currently working on cleaning up and submitting the necessary patches.) I'm curious to hear more details about how AMD has it working but last time I checked, there weren't any missing pieces in Xen or Linux that we'd need.. The AMD downstream changes were mostly related to QEMU. As for the memory management concerns, I would like to remind everyone once again that the pinning of GPU dmabufs in regular graphics workloads would be *very* short-term. In GPU paravirtualization (native contexts or venus or whatever else) the guest mostly operates on *opaque handles* that refer to buffers owned by the host GPU process. The typical rendering process (roughly) only involves submitting commands to the GPU that refer to memory using these handles. Only upon mmap() would a buffer be pinned/granted to the guest, and those are typically only used for *uploads* where the guest immediately does its memcpy() and unmaps the buffer. So I'm not worried about (unintentionally) pinning too much GPU driver memory. In terms of deliberate denial-of-service attacks from the guest to the host, the only reasonable response is: ¯\_(ツ)_/¯ CPU-mapping lots of GPU memory is far from the only DoS vector, the GPU commands themselves can easily wedge the GPU core in a million ways (and last time I checked amdgpu was noooot so good at recovering from hangs). [1]: https://github.com/vireshk/xen-vhost-frontend ~val ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Pinned, non-revocable mappings of VRAM: will bad things happen? 2026-04-21 16:55 ` Val Packett @ 2026-04-21 17:43 ` Christian König 2026-04-22 1:27 ` Demi Marie Obenour 1 sibling, 0 replies; 15+ messages in thread From: Christian König @ 2026-04-21 17:43 UTC (permalink / raw) To: Val Packett, Demi Marie Obenour, dri-devel, Xen developer discussion, linux-media Cc: Suwit Semal, Pelloux-Prayer, Pierre-Eric On 4/21/26 18:55, Val Packett wrote: > > On 4/20/26 4:12 PM, Demi Marie Obenour wrote: >> On 4/20/26 14:53, Christian König wrote: >>> On 4/20/26 20:46, Demi Marie Obenour wrote: >>>> On 4/20/26 13:58, Christian König wrote: >>>>> On 4/20/26 19:03, Demi Marie Obenour wrote: >>>>>> On 4/20/26 04:49, Christian König wrote: >>>>>>> On 4/17/26 21:35, Demi Marie Obenour wrote: >>>>> ... >>>>>>>> Are any of the following reasonable options? >>>>>>>> >>>>>>>> 1. Change the guest kernel to only map (and thus pin) a small subset >>>>>>>> of VRAM at any given time. If unmapped VRAM is accessed the guest >>>>>>>> traps the page fault, evicts an old VRAM mapping, and creates a >>>>>>>> new one. >>>>>>> Yeah, that could potentially work. >>>>>>> >>>>>>> This is basically what we do on the host kernel driver when we can't resize the BAR for some reason. In that use case VRAM buffers are shuffled in and out of the CPU accessible window of VRAM on demand. >>>>>> How much is this going to hurt performance? >>>>> Hard to say, resizing the BAR can easily give you 10-15% more performance on some use cases. >>>>> >>>>> But that involves physically transferring the data using a DMA. For this solution we basically only have to we basically only have to transfer a few messages between host and guest. >>>>> >>>>> No idea how performant that is. >>>> In this use-case, 20-30% performance penalties are likely to be >>>> "business as usual". >>> Well that is quite a bit. >>> >>>> Close to native performance would be ideal, but >>>> to be useful it just needs to beat software rendering by a wide margin, >>>> and not cause data corruption or vulnerabilities. >>> That should still easily be the case, even trivial use cases are multiple magnitudes faster on GPUs compared to software rendering. >> Makes sense. If only GPUs supported easy and flexible virtualization the way CPUs do :(. >> >>>>>>> But I have one question: When XEN has a problem handling faults from the guest on the host then how does that work for system memory mappings? >>>>>>> >>>>>>> There is really no difference between VRAM and system memory in the handling for the GPU driver stack. >>>>>>> >>>>>>> Regards, >>>>>>> Christian. >>>>>> Generally, Xen makes the frontend (usually an unprivileged VM) >>>>>> responsible for providing mappings to the backend (usually the host). >>>>>> That is possible with system RAM but not with VRAM, because Xen has >>>>>> no awareness of VRAM. To Xen, VRAM is just a PCI BAR. >>>>> No, that doesn't work with system memory allocations of GPU drivers either. >>>>> >>>>> We already had it multiple times that people tried to be clever and incremented the page reference counter on driver allocated system memory and were totally surprised that this can result in security issues and data corruption. >>>>> >>>>> I seriously hope that this isn't the case here again. As far as I know XEN already has support for accessing VMAs with VM_PFN or otherwise I don't know how driver allocated system memory access could potentially work. >>>>> >>>>> Accessing VRAM is pretty much the same use case as far as I can see. 
>>>>> >>>>> Regards, >>>>> Christian. >>>> The Xen-native approach would be for system memory allocations to >>>> be made using the Xen driver and then imported into the virtio-GPU >>>> driver via dmabuf. Is there any chance this could be made to happen? >>> That could be. Adding Pierre-Eric to comment since he knows that use much better than I do. >>> >>>> If it's a lost cause, then how much is the memory overhead of pinning >>>> everything ever used in a dmabuf? It should be possible to account >>>> pinned host memory against a guest's quota, but if that leads to an >>>> unusable system it isn't going to be good. >>> That won't work at all. >>> >>> We have use cases where you *must* migrate a DMA-buf to VRAM or otherwise the GPU can't use it. >>> >>> A simple scanout to a monitor is such an use case for example, that is usually not possible from system memory. >> Direct scanout isn't a concern here. >> >>>> Is supporting page faults in Xen the only solution that will be viable >>>> long-term, considering the tolerance for very substantial performance >>>> overheads compared to native? AAA gaming isn't the initial goal here. >>>> Qubes OS already supports PCI passthrough for that. >>> We have AAA gaming working on XEN through native context working for quite a while. >>> >>> Pierre-Eric can tell you more about that. >>> >>> Regards, >>> Christian. >> I've heard of that, but last I checked it required downstream patches >> to Xen, Linux, and QEMU. I don't know if any of those have been >> upstreamed since, but I believe that upstreaming the Xen and Linux >> patches (or rewriting them and upstreaming the rewritten version) would >> be necessary. Qubes OS (which I don't work for anymore but still want >> to help with this) almost certainly won't be using QEMU for GPU stuff. > > Yeah, our plan is to use xen-vhost-frontend[1] + vhost-device-gpu, ported/extended/modified as necessary. (I already have xen-vhost-frontend itself working on amd64 PVH with purely xenbus-based hotplug/configuration, currently working on cleaning up and submitting the necessary patches.) > > I'm curious to hear more details about how AMD has it working but last time I checked, there weren't any missing pieces in Xen or Linux that we'd need.. The AMD downstream changes were mostly related to QEMU. > > As for the memory management concerns, I would like to remind everyone once again that the pinning of GPU dmabufs in regular graphics workloads would be *very* short-term. In GPU paravirtualization (native contexts or venus or whatever else) the guest mostly operates on *opaque handles* that refer to buffers owned by the host GPU process. The typical rendering process (roughly) only involves submitting commands to the GPU that refer to memory using these handles. Only upon mmap() would a buffer be pinned/granted to the guest, and those are typically only used for *uploads* where the guest immediately does its memcpy() and unmaps the buffer. No, that is not correct at all. CPU mappings of buffers for GPUs are pretty much permanent through the whole lifetime of the buffer. Otherwise you can completely forget running any halfway modern workloads. > So I'm not worried about (unintentionally) pinning too much GPU driver memory.
> > In terms of deliberate denial-of-service attacks from the guest to the host, the only reasonable response is: > > ¯\_(ツ)_/¯ > > CPU-mapping lots of GPU memory is far from the only DoS vector, the GPU commands themselves can easily wedge the GPU core in a million ways (and last time I checked amdgpu was noooot so good at recovering from hangs). Yeah that is certainly true :) Regards, Christian. > > > [1]: https://github.com/vireshk/xen-vhost-frontend > > ~val > ^ permalink raw reply [flat|nested] 15+ messages in thread
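As a point of reference for the pinning discussion above: on the importer side, the non-revocable mapping being debated corresponds roughly to the dynamic-attach-then-pin flow sketched below. This is only a minimal illustration of the upstream dma-buf importer API under the stated assumptions (a hypothetical Xen backend acting as importer; the xen_* names are placeholders), not code from any existing tree.

/*
 * Minimal sketch of a pinned dynamic dma-buf importer.  Error handling is
 * reduced to the essentials; a real backend would keep the attachment and
 * sg_table around for the whole lifetime of the guest mapping.
 */
#include <linux/dma-buf.h>
#include <linux/dma-resv.h>

/* Required for dynamic attachments; never fires while the buffer stays pinned. */
static void xen_move_notify(struct dma_buf_attachment *attach)
{
}

static const struct dma_buf_attach_ops xen_attach_ops = {
        .allow_peer2peer = true,        /* allow reaching dGPU VRAM through the PCI BAR */
        .move_notify     = xen_move_notify,
};

/* Attach @dmabuf to @dev, pin its backing store and map it for DMA. */
static struct sg_table *xen_pin_and_map(struct dma_buf *dmabuf, struct device *dev)
{
        struct dma_buf_attachment *attach;
        struct sg_table *sgt;
        int ret;

        attach = dma_buf_dynamic_attach(dmabuf, dev, &xen_attach_ops, NULL);
        if (IS_ERR(attach))
                return ERR_CAST(attach);

        dma_resv_lock(dmabuf->resv, NULL);
        ret = dma_buf_pin(attach);      /* the exporter may refuse, e.g. for VRAM */
        if (ret) {
                sgt = ERR_PTR(ret);
                goto unlock;
        }
        sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
        if (IS_ERR(sgt))
                dma_buf_unpin(attach);
unlock:
        dma_resv_unlock(dmabuf->resv);
        if (IS_ERR(sgt))
                dma_buf_detach(dmabuf, attach);
        return sgt;
}

Everything past dma_buf_pin() assumes the exporter agreed to the pin; with VRAM-backed buffers it may legitimately refuse.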
* Re: Pinned, non-revocable mappings of VRAM: will bad things happen? 2026-04-21 16:55 ` Val Packett 2026-04-21 17:43 ` Christian König @ 2026-04-22 1:27 ` Demi Marie Obenour 2026-04-22 2:03 ` Alex Deucher 1 sibling, 1 reply; 15+ messages in thread From: Demi Marie Obenour @ 2026-04-22 1:27 UTC (permalink / raw) To: Val Packett, Christian König, dri-devel, Xen developer discussion, linux-media Cc: Suwit Semal, Pelloux-Prayer, Pierre-Eric [-- Attachment #1.1.1: Type: text/plain, Size: 7660 bytes --] On 4/21/26 12:55, Val Packett wrote: > > On 4/20/26 4:12 PM, Demi Marie Obenour wrote: >> On 4/20/26 14:53, Christian König wrote: >>> On 4/20/26 20:46, Demi Marie Obenour wrote: >>>> On 4/20/26 13:58, Christian König wrote: >>>>> On 4/20/26 19:03, Demi Marie Obenour wrote: >>>>>> On 4/20/26 04:49, Christian König wrote: >>>>>>> On 4/17/26 21:35, Demi Marie Obenour wrote: >>>>> ... >>>>>>>> Are any of the following reasonable options? >>>>>>>> >>>>>>>> 1. Change the guest kernel to only map (and thus pin) a small subset >>>>>>>> of VRAM at any given time. If unmapped VRAM is accessed the guest >>>>>>>> traps the page fault, evicts an old VRAM mapping, and creates a >>>>>>>> new one. >>>>>>> Yeah, that could potentially work. >>>>>>> >>>>>>> This is basically what we do on the host kernel driver when we can't resize the BAR for some reason. In that use case VRAM buffers are shuffled in and out of the CPU accessible window of VRAM on demand. >>>>>> How much is this going to hurt performance? >>>>> Hard to say, resizing the BAR can easily give you 10-15% more performance on some use cases. >>>>> >>>>> But that involves physically transferring the data using a DMA. For this solution we basically only have to we basically only have to transfer a few messages between host and guest. >>>>> >>>>> No idea how performant that is. >>>> In this use-case, 20-30% performance penalties are likely to be >>>> "business as usual". >>> Well that is quite a bit. >>> >>>> Close to native performance would be ideal, but >>>> to be useful it just needs to beat software rendering by a wide margin, >>>> and not cause data corruption or vulnerabilities. >>> That should still easily be the case, even trivial use cases are multiple magnitudes faster on GPUs compared to software rendering. >> Makes sense. If only GPUs supported easy and flexible virtualization the way CPUs do :(. >> >>>>>>> But I have one question: When XEN has a problem handling faults from the guest on the host then how does that work for system memory mappings? >>>>>>> >>>>>>> There is really no difference between VRAM and system memory in the handling for the GPU driver stack. >>>>>>> >>>>>>> Regards, >>>>>>> Christian. >>>>>> Generally, Xen makes the frontend (usually an unprivileged VM) >>>>>> responsible for providing mappings to the backend (usually the host). >>>>>> That is possible with system RAM but not with VRAM, because Xen has >>>>>> no awareness of VRAM. To Xen, VRAM is just a PCI BAR. >>>>> No, that doesn't work with system memory allocations of GPU drivers either. >>>>> >>>>> We already had it multiple times that people tried to be clever and incremented the page reference counter on driver allocated system memory and were totally surprised that this can result in security issues and data corruption. >>>>> >>>>> I seriously hope that this isn't the case here again. As far as I know XEN already has support for accessing VMAs with VM_PFN or otherwise I don't know how driver allocated system memory access could potentially work. 
>>>>> >>>>> Accessing VRAM is pretty much the same use case as far as I can see. >>>>> >>>>> Regards, >>>>> Christian. >>>> The Xen-native approach would be for system memory allocations to >>>> be made using the Xen driver and then imported into the virtio-GPU >>>> driver via dmabuf. Is there any chance this could be made to happen? >>> That could be. Adding Pierre-Eric to comment since he knows that use much better than I do. >>> >>>> If it's a lost cause, then how much is the memory overhead of pinning >>>> everything ever used in a dmabuf? It should be possible to account >>>> pinned host memory against a guest's quota, but if that leads to an >>>> unusable system it isn't going to be good. >>> That won't work at all. >>> >>> We have use cases where you *must* migrate a DMA-buf to VRAM or otherwise the GPU can't use it. >>> >>> A simple scanout to a monitor is such an use case for example, that is usually not possible from system memory. >> Direct scanout isn't a concern here. >> >>>> Is supporting page faults in Xen the only solution that will be viable >>>> long-term, considering the tolerance for very substantial performance >>>> overheads compared to native? AAA gaming isn't the initial goal here. >>>> Qubes OS already supports PCI passthrough for that. >>> We have AAA gaming working on XEN through native context working for quite a while. >>> >>> Pierre-Eric can tell you more about that. >>> >>> Regards, >>> Christian. >> I've heard of that, but last I checked it required downstream patches >> to Xen, Linux, and QEMU. I don't know if any of those have been >> upstreamed since, but I believe that upstreaming the Xen and Linux >> patches (or rewriting them and upstreaming the rewritten version) would >> be necessary. Qubes OS (which I don't work for anymore but still want >> to help with this) almost certainly won't be using QEMU for GPU stuff. > > Yeah, our plan is to use xen-vhost-frontend[1] + vhost-device-gpu, > ported/extended/modified as necessary. (I already have > xen-vhost-frontend itself working on amd64 PVH with purely xenbus-based > hotplug/configuration, currently working on cleaning up and submitting > the necessary patches.) > > I'm curious to hear more details about how AMD has it working but last > time I checked, there weren't any missing pieces in Xen or Linux that > we'd need.. The AMD downstream changes were mostly related to QEMU. > > As for the memory management concerns, I would like to remind everyone > once again that the pinning of GPU dmabufs in regular graphics workloads > would be *very* short-term. In GPU paravirtualization (native contexts > or venus or whatever else) the guest mostly operates on *opaque handles* > that refer to buffers owned by the host GPU process. The typical > rendering process (roughly) only involves submitting commands to the GPU > that refer to memory using these handles. Only upon mmap() would a > buffer be pinned/granted to the guest, and those are typically only used > for *uploads* where the guest immediately does its memcpy() and unmaps > the buffer. > > So I'm not worried about (unintentionally) pinning too much GPU driver > memory. > > In terms of deliberate denial-of-service attacks from the guest to the > host, the only reasonable response is: > > ¯\_(ツ)_/¯ > > CPU-mapping lots of GPU memory is far from the only DoS vector, the GPU > commands themselves can easily wedge the GPU core in a million ways (and > last time I checked amdgpu was noooot so good at recovering from hangs). 
> > > [1]: https://github.com/vireshk/xen-vhost-frontend > > ~val I think it is best to handle things like GPU crashes by giving the guest some time to unmap its grants, and if that fails, crashing it. This should be done from a revoke callback, as afterwards the VRAM might get reused. Does amdgpu call revoke callbacks when the device is reset and VRAM is lost? It seems like it at least ought to. As an aside, Qubes needs to use the process isolation mode of the amdgpu driver. This means that only one process will be on the GPU at a time, so it _should_ be possible to blow away all GPU-resident state except VRAM without affecting other processes. Unfortunately, I think AMD GPUs might have HW or FW limitations that prevent that, at least on dGPUs. It might make sense to recommend KDE with GPU acceleration. KWin can recover from losing VRAM. -- Sincerely, Demi Marie Obenour (she/her/hers) [-- Attachment #1.1.2: OpenPGP public key --] [-- Type: application/pgp-keys, Size: 7253 bytes --] [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
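There is no revoke callback in today's dma-buf importer interface (move_notify is the closest existing hook), so the grace-period-then-kill policy described above is sketched here purely to illustrate the ordering; struct xen_export and every xen_backend_* helper are hypothetical names, not existing code.

/*
 * Hypothetical sketch: react to a revoke/move notification by asking the
 * guest to drop its grant mappings, and crash the domain if it does not
 * comply within a grace period.
 */
#include <linux/dma-buf.h>
#include <linux/workqueue.h>
#include <linux/completion.h>
#include <linux/jiffies.h>

#define XEN_UNMAP_GRACE_MS      5000    /* assumed grace period */

struct xen_export {
        struct work_struct      revoke_work;
        struct completion       unmapped;       /* completed once the guest unmapped its grants */
        /* grant references, domid, ... */
};

/* Called by the exporter with its dma_resv lock held: must not sleep,
 * so defer the actual policy to a worker. */
static void xen_export_revoke(struct dma_buf_attachment *attach)
{
        struct xen_export *xe = attach->importer_priv;

        schedule_work(&xe->revoke_work);
}

static void xen_export_revoke_worker(struct work_struct *work)
{
        struct xen_export *xe = container_of(work, struct xen_export, revoke_work);

        xen_backend_request_unmap(xe);          /* hypothetical: ask the guest to unmap */
        if (!wait_for_completion_timeout(&xe->unmapped,
                                         msecs_to_jiffies(XEN_UNMAP_GRACE_MS)))
                xen_backend_crash_domain(xe);   /* hypothetical: last resort */
}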
* Re: Pinned, non-revocable mappings of VRAM: will bad things happen? 2026-04-22 1:27 ` Demi Marie Obenour @ 2026-04-22 2:03 ` Alex Deucher 0 siblings, 0 replies; 15+ messages in thread From: Alex Deucher @ 2026-04-22 2:03 UTC (permalink / raw) To: Demi Marie Obenour Cc: Val Packett, Christian König, dri-devel, Xen developer discussion, linux-media, Suwit Semal, Pelloux-Prayer, Pierre-Eric On Tue, Apr 21, 2026 at 9:34 PM Demi Marie Obenour <demiobenour@gmail.com> wrote: > > On 4/21/26 12:55, Val Packett wrote: > > > > On 4/20/26 4:12 PM, Demi Marie Obenour wrote: > >> On 4/20/26 14:53, Christian König wrote: > >>> On 4/20/26 20:46, Demi Marie Obenour wrote: > >>>> On 4/20/26 13:58, Christian König wrote: > >>>>> On 4/20/26 19:03, Demi Marie Obenour wrote: > >>>>>> On 4/20/26 04:49, Christian König wrote: > >>>>>>> On 4/17/26 21:35, Demi Marie Obenour wrote: > >>>>> ... > >>>>>>>> Are any of the following reasonable options? > >>>>>>>> > >>>>>>>> 1. Change the guest kernel to only map (and thus pin) a small subset > >>>>>>>> of VRAM at any given time. If unmapped VRAM is accessed the guest > >>>>>>>> traps the page fault, evicts an old VRAM mapping, and creates a > >>>>>>>> new one. > >>>>>>> Yeah, that could potentially work. > >>>>>>> > >>>>>>> This is basically what we do on the host kernel driver when we can't resize the BAR for some reason. In that use case VRAM buffers are shuffled in and out of the CPU accessible window of VRAM on demand. > >>>>>> How much is this going to hurt performance? > >>>>> Hard to say, resizing the BAR can easily give you 10-15% more performance on some use cases. > >>>>> > >>>>> But that involves physically transferring the data using a DMA. For this solution we basically only have to we basically only have to transfer a few messages between host and guest. > >>>>> > >>>>> No idea how performant that is. > >>>> In this use-case, 20-30% performance penalties are likely to be > >>>> "business as usual". > >>> Well that is quite a bit. > >>> > >>>> Close to native performance would be ideal, but > >>>> to be useful it just needs to beat software rendering by a wide margin, > >>>> and not cause data corruption or vulnerabilities. > >>> That should still easily be the case, even trivial use cases are multiple magnitudes faster on GPUs compared to software rendering. > >> Makes sense. If only GPUs supported easy and flexible virtualization the way CPUs do :(. > >> > >>>>>>> But I have one question: When XEN has a problem handling faults from the guest on the host then how does that work for system memory mappings? > >>>>>>> > >>>>>>> There is really no difference between VRAM and system memory in the handling for the GPU driver stack. > >>>>>>> > >>>>>>> Regards, > >>>>>>> Christian. > >>>>>> Generally, Xen makes the frontend (usually an unprivileged VM) > >>>>>> responsible for providing mappings to the backend (usually the host). > >>>>>> That is possible with system RAM but not with VRAM, because Xen has > >>>>>> no awareness of VRAM. To Xen, VRAM is just a PCI BAR. > >>>>> No, that doesn't work with system memory allocations of GPU drivers either. > >>>>> > >>>>> We already had it multiple times that people tried to be clever and incremented the page reference counter on driver allocated system memory and were totally surprised that this can result in security issues and data corruption. > >>>>> > >>>>> I seriously hope that this isn't the case here again. 
As far as I know XEN already has support for accessing VMAs with VM_PFN or otherwise I don't know how driver allocated system memory access could potentially work. > >>>>> > >>>>> Accessing VRAM is pretty much the same use case as far as I can see. > >>>>> > >>>>> Regards, > >>>>> Christian. > >>>> The Xen-native approach would be for system memory allocations to > >>>> be made using the Xen driver and then imported into the virtio-GPU > >>>> driver via dmabuf. Is there any chance this could be made to happen? > >>> That could be. Adding Pierre-Eric to comment since he knows that use much better than I do. > >>> > >>>> If it's a lost cause, then how much is the memory overhead of pinning > >>>> everything ever used in a dmabuf? It should be possible to account > >>>> pinned host memory against a guest's quota, but if that leads to an > >>>> unusable system it isn't going to be good. > >>> That won't work at all. > >>> > >>> We have use cases where you *must* migrate a DMA-buf to VRAM or otherwise the GPU can't use it. > >>> > >>> A simple scanout to a monitor is such an use case for example, that is usually not possible from system memory. > >> Direct scanout isn't a concern here. > >> > >>>> Is supporting page faults in Xen the only solution that will be viable > >>>> long-term, considering the tolerance for very substantial performance > >>>> overheads compared to native? AAA gaming isn't the initial goal here. > >>>> Qubes OS already supports PCI passthrough for that. > >>> We have AAA gaming working on XEN through native context working for quite a while. > >>> > >>> Pierre-Eric can tell you more about that. > >>> > >>> Regards, > >>> Christian. > >> I've heard of that, but last I checked it required downstream patches > >> to Xen, Linux, and QEMU. I don't know if any of those have been > >> upstreamed since, but I believe that upstreaming the Xen and Linux > >> patches (or rewriting them and upstreaming the rewritten version) would > >> be necessary. Qubes OS (which I don't work for anymore but still want > >> to help with this) almost certainly won't be using QEMU for GPU stuff. > > > > Yeah, our plan is to use xen-vhost-frontend[1] + vhost-device-gpu, > > ported/extended/modified as necessary. (I already have > > xen-vhost-frontend itself working on amd64 PVH with purely xenbus-based > > hotplug/configuration, currently working on cleaning up and submitting > > the necessary patches.) > > > > I'm curious to hear more details about how AMD has it working but last > > time I checked, there weren't any missing pieces in Xen or Linux that > > we'd need.. The AMD downstream changes were mostly related to QEMU. > > > > As for the memory management concerns, I would like to remind everyone > > once again that the pinning of GPU dmabufs in regular graphics workloads > > would be *very* short-term. In GPU paravirtualization (native contexts > > or venus or whatever else) the guest mostly operates on *opaque handles* > > that refer to buffers owned by the host GPU process. The typical > > rendering process (roughly) only involves submitting commands to the GPU > > that refer to memory using these handles. Only upon mmap() would a > > buffer be pinned/granted to the guest, and those are typically only used > > for *uploads* where the guest immediately does its memcpy() and unmaps > > the buffer. > > > > So I'm not worried about (unintentionally) pinning too much GPU driver > > memory. 
> > > > In terms of deliberate denial-of-service attacks from the guest to the > > host, the only reasonable response is: > > > > ¯\_(ツ)_/¯ > > > > CPU-mapping lots of GPU memory is far from the only DoS vector, the GPU > > commands themselves can easily wedge the GPU core in a million ways (and > > last time I checked amdgpu was noooot so good at recovering from hangs). > > > > > > [1]: https://github.com/vireshk/xen-vhost-frontend > > > > ~val > > I think it is best to handle things like GPU crashes by giving the guest > some time to unmap its grants, and if that fails, crashing it. This should > be done from a revoke callback, as afterwards the VRAM might get reused. > > Does amdgpu call revoke callbacks when the device is reset and VRAM > is lost? It seems like it at least ought to. > > As an aside, Qubes needs to use the process isolation mode of the > amdgpu driver. This means that only one process will be on the GPU > at a time, so it _should_ be possible to blow away all GPU-resident > state except VRAM without affecting other processes. Unfortunately, > I think AMD GPUs might have HW or FW limitations that prevent that, > at least on dGPUs. The driver has supported per queue resets for a few kernel releases now so only the bad app would be affected in that case. Alex > > It might make sense to recommend KDE with GPU acceleration. KWin can > recover from losing VRAM. > -- > Sincerely, > Demi Marie Obenour (she/her/hers) ^ permalink raw reply [flat|nested] 15+ messages in thread
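For what it's worth, the recovery that "KWin can recover from losing VRAM" relies on is ordinary GPU-reset robustness, which a per-queue reset also surfaces to the guilty client. A minimal sketch of that client-side check follows, assuming a GL context created with robust access; recreate_context_and_reupload() is an application-specific placeholder.

/*
 * Sketch only: poll the GL robustness status after a failed frame and
 * rebuild everything if the context was lost.  Requires a context created
 * with GL_KHR_robustness / OpenGL 4.5 robust access.
 */
#include <stdbool.h>
#include <epoxy/gl.h>   /* any loader exposing glGetGraphicsResetStatus() works */

void recreate_context_and_reupload(void);       /* application-specific placeholder */

static bool context_still_valid(void)
{
        GLenum status = glGetGraphicsResetStatus();

        if (status == GL_NO_ERROR)
                return true;

        /* GUILTY, INNOCENT or UNKNOWN reset: the context and everything it
         * had in VRAM is gone; tear down and start over. */
        recreate_context_and_reupload();
        return false;
}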
End of thread (newest: 2026-04-22 2:04 UTC)

Thread overview: 15+ messages
2026-04-15 23:27 Pinned, non-revocable mappings of VRAM: will bad things happen? Demi Marie Obenour
2026-04-16 9:57 ` Christian König
2026-04-16 16:13 ` Demi Marie Obenour
2026-04-17 7:53 ` Christian König
2026-04-17 19:35 ` Demi Marie Obenour
2026-04-20 8:49 ` Christian König
2026-04-20 17:03 ` Demi Marie Obenour
2026-04-20 17:58 ` Christian König
2026-04-20 18:46 ` Demi Marie Obenour
2026-04-20 18:53 ` Christian König
2026-04-20 19:12 ` Demi Marie Obenour
2026-04-21 16:55 ` Val Packett
2026-04-21 17:43 ` Christian König
2026-04-22 1:27 ` Demi Marie Obenour
2026-04-22 2:03 ` Alex Deucher