xen-devel.lists.xenproject.org archive mirror
* [PATCH][VT-d] Dis-allow PCI device assignment if PoD is enabled
@ 2010-01-21 12:28 Xu, Dongxiao
  2010-01-21 13:45 ` George Dunlap
  0 siblings, 1 reply; 4+ messages in thread
From: Xu, Dongxiao @ 2010-01-21 12:28 UTC (permalink / raw)
  To: xen-devel@lists.xensource.com
  Cc: George.Dunlap@eu.citrix.com, Han, Weidong, Keir Fraser,
	Cui, Dexuan

[-- Attachment #1: Type: text/plain, Size: 475 bytes --]

It seems that we currently have no code to handle the coexistence
of VT-d and PoD. The VT-d engine needs to set up the entire page
table for the domain. However, if PoD is enabled, unpopulated memory
is marked as populate_on_demand, and the VT-d engine won't set up
page table entries for it. Therefore any DMA to that memory may
cause a DMA fault.
	So, as a safety measure, it's better to disallow PCI device
assignment if PoD is enabled.
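
A toy model of the failure mode described above, in Python. This is not
Xen code: the p2m is assumed to be a plain dict from guest frame number
to machine frame number, and the names POD, build_iommu_table and
dma_access are invented for illustration.

```python
# Toy model only: the real p2m and VT-d page tables live in the hypervisor.
POD = object()  # hypothetical marker for a populate-on-demand entry


def build_iommu_table(p2m):
    """Copy only populated entries; PoD entries get no IOMMU mapping."""
    return {gfn: mfn for gfn, mfn in p2m.items() if mfn is not POD}


def dma_access(iommu_table, gfn):
    """Return the machine frame for a DMA target, or raise on a missing mapping."""
    try:
        return iommu_table[gfn]
    except KeyError:
        raise RuntimeError("DMA fault: gfn %#x has no IOMMU mapping" % gfn)


# A guest with three frames, one of which is still populate-on-demand.
p2m = {0x0: 0x1000, 0x1: POD, 0x2: 0x1002}
iommu_table = build_iommu_table(p2m)
```

In this model a DMA to gfn 0x0 or 0x2 translates, while a DMA to gfn 0x1
raises, which is the case the patch avoids by refusing assignment.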

Best Regards, 
-- Dongxiao

[-- Attachment #2: pod_disable_dev_assign.patch --]
[-- Type: application/octet-stream, Size: 1843 bytes --]

Dis-allow device assignment if PoD is enabled. 

Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
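
For readers who want to try the check outside xend, here is a
self-contained sketch of the same logic as the patch below. The VmError
class is a stand-in defined here so the snippet runs on its own, and
passing the config as a plain dict is an assumption of the sketch; the
key names are the ones the patch reads.

```python
class VmError(Exception):
    """Stand-in for xend's VmError so the sketch runs on its own."""


def pod_enabled(info):
    """PoD is treated as enabled when the static max exceeds the dynamic max."""
    maxmem = info.get('memory_static_max', 0)
    memory = info.get('memory_dynamic_max', 0)
    return maxmem > memory


def iommu_check_pod_mode(info):
    """Refuse PCI device assignment for a domain that would use PoD."""
    if pod_enabled(info):
        raise VmError("failed to assign device since pod is enabled")
```

A domain whose two values are equal passes the check; one with a larger
static maximum is refused.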

diff -r b0b41e735575 tools/python/xen/xend/XendDomainInfo.py
--- a/tools/python/xen/xend/XendDomainInfo.py	Wed Jan 20 09:51:38 2010 +0000
+++ b/tools/python/xen/xend/XendDomainInfo.py	Thu Jan 21 20:08:57 2010 +0800
@@ -390,6 +390,14 @@ class XendDomainInfo:
             self.domid = domid
         self.guest_bitsize = None
         self.alloc_mem = None
+
+        maxmem = self.info.get('memory_static_max', 0)
+        memory = self.info.get('memory_dynamic_max', 0)
+
+        if maxmem > memory:
+            self.pod_enabled = True
+        else:
+            self.pod_enabled = False
         
         #REMOVE: uuid is now generated in XendConfig
         #if not self._infoIsSet('uuid'):
@@ -694,11 +702,18 @@ class XendDomainInfo:
 
         return self.hvm_pci_device_insert_dev(new_dev)
 
+    def iommu_check_pod_mode(self):
+        """ Disallow PCI device assignment if pod is enabled. """
+        if self.pod_enabled:
+            raise VmError("failed to assign device since pod is enabled")
+
     def pci_dev_check_assignability_and_do_FLR(self, config):
         """ In the case of static device assignment(i.e., the 'pci' string in
         guest config file), we check if the device(s) specified in the 'pci'
         can be  assigned to guest or not; if yes, we do_FLR the device(s).
         """
+
+        self.iommu_check_pod_mode()
         pci_dev_ctrl = self.getDeviceController('pci')
         return pci_dev_ctrl.dev_check_assignability_and_do_FLR(config)
 
@@ -707,6 +722,8 @@ class XendDomainInfo:
         check if the device can be attached to guest or not; if yes, we do_FLR
         the device.
         """
+
+        self.iommu_check_pod_mode()
 
         # Test whether the devices can be assigned
 



* Re: [PATCH][VT-d] Dis-allow PCI device assignment if PoD is enabled
  2010-01-21 12:28 [PATCH][VT-d] Dis-allow PCI device assignment if PoD is enabled Xu, Dongxiao
@ 2010-01-21 13:45 ` George Dunlap
  2010-01-21 18:02   ` Ian Pratt
  0 siblings, 1 reply; 4+ messages in thread
From: George Dunlap @ 2010-01-21 13:45 UTC (permalink / raw)
  To: Xu, Dongxiao
  Cc: Keir Fraser, xen-devel@lists.xensource.com, Han, Weidong,
	Cui, Dexuan

Seems like a good "seatbelt" for 4.0.

Looking forward, what would it take to make PoD and VT-d coexist?  We 
only need PoD during boot, until the balloon driver comes up and 
balloons down the guest's memory.  Three solutions come to mind, but as 
I don't know the constraints of VT-d, I don't know which is feasible (if 
any):
* Redo the VT-d mapping every time the p2m map changes as a result of PoD.
* While PoD pages exist, intercept device commands, and redo the VT-d 
map if the page was marked PoD the last time we updated the VT-d map
* Detect DMA faults, instantiate the page if necessary, update the VT-d 
map, and re-start the transaction.
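
As a rough illustration of the first option only, the Python sketch
below mirrors every p2m change into a second table standing in for the
VT-d map. It is a toy model: ToyDomain, set_p2m_entry and the update
counter are invented names, and nothing here reflects the actual cost
of a VT-d page table update.

```python
POD = object()  # hypothetical marker for a populate-on-demand entry


class ToyDomain:
    """Keeps a stand-in IOMMU map in step with the p2m on every change."""

    def __init__(self):
        self.p2m = {}
        self.iommu_map = {}
        self.iommu_updates = 0  # how many times the IOMMU map was touched

    def set_p2m_entry(self, gfn, mfn):
        self.p2m[gfn] = mfn
        if mfn is POD:
            # The page is (or became) unpopulated: drop any existing mapping.
            if self.iommu_map.pop(gfn, None) is not None:
                self.iommu_updates += 1
        else:
            # The page was populated: mirror the new translation.
            self.iommu_map[gfn] = mfn
            self.iommu_updates += 1
```

The counter makes the open question visible: each PoD populate or
reclaim turns into one update of the device-side table.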

How expensive is it to change the VT-d pagetable?  Is a DMA fault 
re-startable?  i.e., could we take a fault, redo the VT-d map, and 
re-issue the DMA request?

 -George

Xu, Dongxiao wrote:
> It seems that currently we don't have any code to handle
> the coexistence of VT-d and PoD. VT-d engine needs to set up
> the entire page table for the domain. However if PoD is enabled,
> un-populated memory is marked as populate_on_demand, and
> VT-d engine won't set up page tables for them. Therefore any
> DMA towards those memory may cause DMA fault.
> 	So for safety concern, its better to dis-allow PCI device
> assignment if PoD is enabled.
>
> Best Regards, 
> -- Dongxiao
>   


* RE: [PATCH][VT-d] Dis-allow PCI device assignment if PoD is enabled
  2010-01-21 13:45 ` George Dunlap
@ 2010-01-21 18:02   ` Ian Pratt
  2010-01-22 12:17     ` George Dunlap
  0 siblings, 1 reply; 4+ messages in thread
From: Ian Pratt @ 2010-01-21 18:02 UTC (permalink / raw)
  To: George Dunlap, Xu, Dongxiao
  Cc: Ian, Cui, Dexuan, Han, Weidong, Pratt,
	xen-devel@lists.xensource.com, Fraser

> Looking forward, what would it take to make PoD and VT-d coexist?  We
> only need PoD during boot, until the balloon driver comes up and
> balloons down the guest's memory.  Three solutions come to mind, but as
> I don't know the constraints of VT-d, I don't know which is feasible (if
> any):
> * Redo the VT-d mapping every time the p2m map changes as a result of PoD.
> * While PoD pages exist, intercept device commands, and redo the VT-d
> map if the page was marked PoD the last time we updated the VT-d map
> * Detect DMA faults, instantiate the page if necessary, update the VT-d
> map, and re-start the transaction.
> 
> How expensive is it to change the VT-d pagetable?  Is a DMA fault
> re-startable?  i.e., could we take a fault, redo the VT-d map, and
> re-issue the DMA request?

IOMMU faults are not restartable, at least currently.

Flushing the IOTLB is very expensive. Fortunately, negative (not-present) entries are not cached, and since PoD mainly adds new mappings we should be fine. The zeroed-page reclamation stuff is a bit more dicey and would require synchronization against an IOTLB flush before returning the pages to Xen.
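
A toy Python model of the ordering concern, under stated assumptions:
the IOTLB is modelled as a dict that caches only successful
translations, and ToyIommu and reclaim_page are invented names rather
than Xen or VT-d interfaces.

```python
class ToyIommu:
    """Page table plus a translation cache that holds only present entries."""

    def __init__(self):
        self.table = {}  # gfn -> mfn
        self.iotlb = {}  # cached translations

    def map(self, gfn, mfn):
        # Misses are never cached in this model, so adding needs no flush.
        self.table[gfn] = mfn

    def unmap(self, gfn):
        # Removing the table entry does not touch the cache.
        self.table.pop(gfn, None)

    def flush(self):
        self.iotlb.clear()

    def translate(self, gfn):
        if gfn in self.iotlb:
            return self.iotlb[gfn]
        mfn = self.table[gfn]  # KeyError stands in for a DMA fault
        self.iotlb[gfn] = mfn
        return mfn


def reclaim_page(iommu, gfn, free_pool):
    """Unmap, flush, and only then hand the machine frame back."""
    mfn = iommu.table[gfn]
    iommu.unmap(gfn)
    iommu.flush()
    free_pool.append(mfn)
    return mfn
```

Calling unmap() without flush() leaves a stale cached translation, so a
later translate() still reaches the frame; reclaim_page() closes that
window before the frame is reused.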

In many/most cases the device will not be in use that early in boot, so it's a bit annoying to have to maintain the IOMMU pagetables through PoD, but unavoidable. The key thing is that we only have to do it for domains that actually have devices passed through.

Ian

 


* Re: [PATCH][VT-d] Dis-allow PCI device assignment if PoD is enabled
  2010-01-21 18:02   ` Ian Pratt
@ 2010-01-22 12:17     ` George Dunlap
  0 siblings, 0 replies; 4+ messages in thread
From: George Dunlap @ 2010-01-22 12:17 UTC (permalink / raw)
  To: Ian Pratt
  Cc: Xu, Dongxiao, xen-devel@lists.xensource.com, Han, Weidong, Fraser,
	Cui, Dexuan

On Thu, Jan 21, 2010 at 6:02 PM, Ian Pratt <Ian.Pratt@eu.citrix.com> wrote:
> In many/most cases the device will not be in use that early in boot, so it's a bit annoying to have to maintain the IOMMU pagetables through PoD, but unavoidable. The key thing is that we only have to do it for domains that actually have devices passed through.

At the moment, I can't imagine how IOMMU/VT-d can interact well with
PoD during boot, before the balloon driver gets in and does its thing.
It's guaranteed during that time that a high percentage of the memory
which the guest thinks it has free will be not-present in the p2m.
There's no way we can predict which gfns will be passed to the device;
having been zeroed (and thus populated) is no help, since a
non-negligible percentage of zeroed pages will need to be reclaimed
for the PoD pool again anyway.

If it really is true that devices aren't used during boot, then we
could simply have the balloon driver / the tools do a final "sync"
once the "target" has been reached (and outstanding PoD entries ==
size of PoD memory pool).  Doing more than that (say, syncing on every
p2m update) doesn't solve the problem (although I suppose it may be
necessary to prevent corruption).
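
A minimal Python sketch of what such a final "sync" might look like,
with the p2m again modelled as a dict. The names pod_settled and
final_sync are invented, and mapping only the populated entries is an
assumption of the sketch; how any remaining PoD entries should be
handled is left open here.

```python
POD = object()  # hypothetical marker for a populate-on-demand entry


def pod_settled(outstanding_pod_entries, pod_pool_pages):
    """The condition from the text: outstanding PoD entries == PoD pool size."""
    return outstanding_pod_entries == pod_pool_pages


def final_sync(p2m, outstanding_pod_entries, pod_pool_pages):
    """Build the device-side map once, after the balloon target is reached.

    Returns None while the condition does not yet hold.
    """
    if not pod_settled(outstanding_pod_entries, pod_pool_pages):
        return None
    return {gfn: mfn for gfn, mfn in p2m.items() if mfn is not POD}
```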

 -George

