From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Chen, Tiejun" Subject: Re: [v3][PATCH 04/16] xen/passthrough: extend hypercall to support rdm reservation policy Date: Fri, 12 Jun 2015 14:31:03 +0800 Message-ID: <557A7CA7.9000305@intel.com> References: <1433985325-16676-1-git-send-email-tiejun.chen@intel.com> <1433985325-16676-5-git-send-email-tiejun.chen@intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: "Tian, Kevin" , "jbeulich@suse.com" , "tim@xen.org" , "andrew.cooper3@citrix.com" , "Zhang, Yang Z" , "wei.liu2@citrix.com" , "ian.campbell@citrix.com" , "Ian.Jackson@eu.citrix.com" , "stefano.stabellini@citrix.com" Cc: "xen-devel@lists.xen.org" List-Id: xen-devel@lists.xenproject.org On 2015/6/11 17:28, Tian, Kevin wrote: >> From: Chen, Tiejun >> Sent: Thursday, June 11, 2015 9:15 AM >> >> This patch extends the existing hypercall to support rdm reservation policy. >> We return error or just throw out a warning message depending on whether >> the policy is "strict" or "relaxed" when reserving RDM regions in pfn space. >> Note in some special cases, e.g. add a device to hwdomain, and remove a >> device from user domain, 'relaxed' is fine enough since this is always safe >> to hwdomain. > > could you elaborate " add a device to hwdomain, and remove a device > from user domain "? move a device from user domain to hwdomain > or completely irrelevant? Yes, they're not relevant. And I think we shouldn't care our policy, #1. When add a device to hwdomain I think RMRR is always reserved on e820 so either of flag is fine. #2. remove a device from domain "remove" action also can ignore that since the original mechanism is enough. > >> >> Signed-off-by: Tiejun Chen >> --- >> xen/arch/x86/mm/p2m.c | 8 +++++++- >> xen/drivers/passthrough/amd/pci_amd_iommu.c | 3 ++- >> xen/drivers/passthrough/arm/smmu.c | 2 +- >> xen/drivers/passthrough/device_tree.c | 11 ++++++++++- >> xen/drivers/passthrough/pci.c | 10 ++++++---- >> xen/drivers/passthrough/vtd/iommu.c | 20 ++++++++++++-------- >> xen/include/asm-x86/p2m.h | 2 +- >> xen/include/public/domctl.h | 5 +++++ >> xen/include/xen/iommu.h | 2 +- >> 9 files changed, 45 insertions(+), 18 deletions(-) >> >> diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c >> index c7198a5..3fcdcac 100644 >> --- a/xen/arch/x86/mm/p2m.c >> +++ b/xen/arch/x86/mm/p2m.c >> @@ -899,7 +899,7 @@ int set_mmio_p2m_entry(struct domain *d, unsigned long gfn, >> mfn_t mfn, >> } >> >> int set_identity_p2m_entry(struct domain *d, unsigned long gfn, >> - p2m_access_t p2ma) >> + p2m_access_t p2ma, u32 flag) >> { >> p2m_type_t p2mt; >> p2m_access_t a; >> @@ -924,6 +924,12 @@ int set_identity_p2m_entry(struct domain *d, unsigned long gfn, >> printk(XENLOG_G_WARNING >> "Cannot identity map d%d:%lx, already mapped to %lx.\n", >> d->domain_id, gfn, mfn_x(mfn)); >> + >> + if ( flag == XEN_DOMCTL_DEV_RDM_RELAXED ) >> + { >> + ret = 0; >> + printk(XENLOG_G_WARNING "Some devices may work failed.\n"); > > Do you need this extra printk? The warning message is already given > several lines above and here you just need to change return value > for relaxed policy. Okay. > >> + } >> } >> >> gfn_unlock(p2m, gfn, 0); >> diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c >> b/xen/drivers/passthrough/amd/pci_amd_iommu.c >> index e83bb35..920b35a 100644 >> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c >> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c >> @@ -394,7 +394,8 @@ static int reassign_device(struct domain *source, struct domain >> *target, >> } >> >> static int amd_iommu_assign_device(struct domain *d, u8 devfn, >> - struct pci_dev *pdev) >> + struct pci_dev *pdev, >> + u32 flag) >> { >> struct ivrs_mappings *ivrs_mappings = get_ivrs_mappings(pdev->seg); >> int bdf = PCI_BDF2(pdev->bus, devfn); >> diff --git a/xen/drivers/passthrough/arm/smmu.c >> b/xen/drivers/passthrough/arm/smmu.c >> index 6cc4394..9a667e9 100644 >> --- a/xen/drivers/passthrough/arm/smmu.c >> +++ b/xen/drivers/passthrough/arm/smmu.c >> @@ -2605,7 +2605,7 @@ static void arm_smmu_destroy_iommu_domain(struct >> iommu_domain *domain) >> } >> >> static int arm_smmu_assign_dev(struct domain *d, u8 devfn, >> - struct device *dev) >> + struct device *dev, u32 flag) >> { >> struct iommu_domain *domain; >> struct arm_smmu_xen_domain *xen_domain; >> diff --git a/xen/drivers/passthrough/device_tree.c >> b/xen/drivers/passthrough/device_tree.c >> index 5d3842a..ea85645 100644 >> --- a/xen/drivers/passthrough/device_tree.c >> +++ b/xen/drivers/passthrough/device_tree.c >> @@ -52,7 +52,8 @@ int iommu_assign_dt_device(struct domain *d, struct >> dt_device_node *dev) >> goto fail; >> } >> >> - rc = hd->platform_ops->assign_device(d, 0, dt_to_dev(dev)); >> + rc = hd->platform_ops->assign_device(d, 0, dt_to_dev(dev), >> + XEN_DOMCTL_DEV_NO_RDM); >> >> if ( rc ) >> goto fail; >> @@ -148,6 +149,14 @@ int iommu_do_dt_domctl(struct xen_domctl *domctl, struct >> domain *d, >> if ( domctl->u.assign_device.dev != XEN_DOMCTL_DEV_DT ) >> break; >> >> + if ( domctl->u.assign_device.flag == XEN_DOMCTL_DEV_NO_RDM ) >> + { >> + printk(XENLOG_G_ERR "XEN_DOMCTL_assign_device: assign \"%s\"" >> + " to dom%u failed (%d) since we don't support RDM.\n", >> + dt_node_full_name(dev), d->domain_id, ret); >> + break; >> + } >> + >> if ( unlikely(d->is_dying) ) >> { >> ret = -EINVAL; >> diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c >> index e30be43..557c87e 100644 >> --- a/xen/drivers/passthrough/pci.c >> +++ b/xen/drivers/passthrough/pci.c >> @@ -1335,7 +1335,7 @@ static int device_assigned(u16 seg, u8 bus, u8 devfn) >> return pdev ? 0 : -EBUSY; >> } >> >> -static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn) >> +static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn, u32 flag) >> { >> struct hvm_iommu *hd = domain_hvm_iommu(d); >> struct pci_dev *pdev; >> @@ -1371,7 +1371,7 @@ static int assign_device(struct domain *d, u16 seg, u8 bus, u8 >> devfn) >> >> pdev->fault.count = 0; >> >> - if ( (rc = hd->platform_ops->assign_device(d, devfn, pci_to_dev(pdev))) ) >> + if ( (rc = hd->platform_ops->assign_device(d, devfn, pci_to_dev(pdev), flag)) ) >> goto done; >> >> for ( ; pdev->phantom_stride; rc = 0 ) >> @@ -1379,7 +1379,7 @@ static int assign_device(struct domain *d, u16 seg, u8 bus, u8 >> devfn) >> devfn += pdev->phantom_stride; >> if ( PCI_SLOT(devfn) != PCI_SLOT(pdev->devfn) ) >> break; >> - rc = hd->platform_ops->assign_device(d, devfn, pci_to_dev(pdev)); >> + rc = hd->platform_ops->assign_device(d, devfn, pci_to_dev(pdev), flag); >> if ( rc ) >> printk(XENLOG_G_WARNING "d%d: assign %04x:%02x:%02x.%u failed >> (%d)\n", >> d->domain_id, seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn), >> @@ -1496,6 +1496,7 @@ int iommu_do_pci_domctl( >> { >> u16 seg; >> u8 bus, devfn; >> + u32 flag; >> int ret = 0; >> uint32_t machine_sbdf; >> >> @@ -1577,9 +1578,10 @@ int iommu_do_pci_domctl( >> seg = machine_sbdf >> 16; >> bus = PCI_BUS(machine_sbdf); >> devfn = PCI_DEVFN2(machine_sbdf); >> + flag = domctl->u.assign_device.flag; >> >> ret = device_assigned(seg, bus, devfn) ?: >> - assign_device(d, seg, bus, devfn); >> + assign_device(d, seg, bus, devfn, flag); >> if ( ret == -ERESTART ) >> ret = hypercall_create_continuation(__HYPERVISOR_domctl, >> "h", u_domctl); >> diff --git a/xen/drivers/passthrough/vtd/iommu.c >> b/xen/drivers/passthrough/vtd/iommu.c >> index 31ce1af..d7c9e1c 100644 >> --- a/xen/drivers/passthrough/vtd/iommu.c >> +++ b/xen/drivers/passthrough/vtd/iommu.c >> @@ -1808,7 +1808,8 @@ static void iommu_set_pgd(struct domain *d) >> } >> >> static int rmrr_identity_mapping(struct domain *d, bool_t map, >> - const struct acpi_rmrr_unit *rmrr) >> + const struct acpi_rmrr_unit *rmrr, >> + u32 flag) >> { >> unsigned long base_pfn = rmrr->base_address >> PAGE_SHIFT_4K; >> unsigned long end_pfn = PAGE_ALIGN_4K(rmrr->end_address) >> >> PAGE_SHIFT_4K; >> @@ -1856,7 +1857,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map, >> >> while ( base_pfn < end_pfn ) >> { >> - int err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw); >> + int err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw, flag); >> >> if ( err ) >> return err; >> @@ -1899,7 +1900,8 @@ static int intel_iommu_add_device(u8 devfn, struct pci_dev >> *pdev) >> PCI_BUS(bdf) == pdev->bus && >> PCI_DEVFN2(bdf) == devfn ) >> { >> - ret = rmrr_identity_mapping(pdev->domain, 1, rmrr); >> + ret = rmrr_identity_mapping(pdev->domain, 1, rmrr, >> + XEN_DOMCTL_DEV_RDM_RELAXED); > > Why did you hardcode relax policy here? Shouldn't the policy come > from hypercall flag? I just saw we have one path to use intel_iommu_add_device(), pci_add_device() | + if ( !pdev->domain ) { pdev->domain = hardware_domain; ret = iommu_add_device(pdev); | + hd->platform_ops->add_device() | + intel_iommu_add_device() So I think intel_iommu_add_device() is used to add a device to hardware_domain. And in our case hardware_domain should be special as I explained above. > >> if ( ret ) >> dprintk(XENLOG_ERR VTDPREFIX, "d%d: RMRR mapping failed\n", >> pdev->domain->domain_id); >> @@ -1940,7 +1942,8 @@ static int intel_iommu_remove_device(u8 devfn, struct pci_dev >> *pdev) >> PCI_DEVFN2(bdf) != devfn ) >> continue; >> >> - rmrr_identity_mapping(pdev->domain, 0, rmrr); >> + rmrr_identity_mapping(pdev->domain, 0, rmrr, >> + XEN_DOMCTL_DEV_RDM_RELAXED); > > ditto It doesn't matter when we're trying to remove a device since we don't care this flag. > >> } >> >> return domain_context_unmap(pdev->domain, devfn, pdev); >> @@ -2098,7 +2101,7 @@ static void __hwdom_init setup_hwdom_rmrr(struct domain *d) >> spin_lock(&pcidevs_lock); >> for_each_rmrr_device ( rmrr, bdf, i ) >> { >> - ret = rmrr_identity_mapping(d, 1, rmrr); >> + ret = rmrr_identity_mapping(d, 1, rmrr, XEN_DOMCTL_DEV_RDM_RELAXED); >> if ( ret ) >> dprintk(XENLOG_ERR VTDPREFIX, >> "IOMMU: mapping reserved region failed\n"); >> @@ -2241,7 +2244,8 @@ static int reassign_device_ownership( >> PCI_BUS(bdf) == pdev->bus && >> PCI_DEVFN2(bdf) == devfn ) >> { >> - ret = rmrr_identity_mapping(source, 0, rmrr); >> + ret = rmrr_identity_mapping(source, 0, rmrr, >> + XEN_DOMCTL_DEV_RDM_RELAXED); >> if ( ret != -ENOENT ) >> return ret; >> } >> @@ -2265,7 +2269,7 @@ static int reassign_device_ownership( >> } >> >> static int intel_iommu_assign_device( >> - struct domain *d, u8 devfn, struct pci_dev *pdev) >> + struct domain *d, u8 devfn, struct pci_dev *pdev, u32 flag) >> { >> struct acpi_rmrr_unit *rmrr; >> int ret = 0, i; >> @@ -2294,7 +2298,7 @@ static int intel_iommu_assign_device( >> PCI_BUS(bdf) == bus && >> PCI_DEVFN2(bdf) == devfn ) >> { >> - ret = rmrr_identity_mapping(d, 1, rmrr); >> + ret = rmrr_identity_mapping(d, 1, rmrr, flag); >> if ( ret ) >> { >> reassign_device_ownership(d, hardware_domain, devfn, pdev); >> diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h >> index 95b6266..a80b4f8 100644 >> --- a/xen/include/asm-x86/p2m.h >> +++ b/xen/include/asm-x86/p2m.h >> @@ -545,7 +545,7 @@ int clear_mmio_p2m_entry(struct domain *d, unsigned long gfn, >> mfn_t mfn); >> >> /* Set identity addresses in the p2m table (for pass-through) */ >> int set_identity_p2m_entry(struct domain *d, unsigned long gfn, >> - p2m_access_t p2ma); >> + p2m_access_t p2ma, u32 flag); >> >> /* Add foreign mapping to the guest's p2m table. */ >> int p2m_add_foreign(struct domain *tdom, unsigned long fgfn, >> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h >> index bc45ea5..2f9e40e 100644 >> --- a/xen/include/public/domctl.h >> +++ b/xen/include/public/domctl.h >> @@ -478,6 +478,11 @@ struct xen_domctl_assign_device { >> XEN_GUEST_HANDLE_64(char) path; /* path to the device tree node */ >> } dt; >> } u; >> + /* IN */ >> +#define XEN_DOMCTL_DEV_NO_RDM 0 >> +#define XEN_DOMCTL_DEV_RDM_RELAXED 1 >> +#define XEN_DOMCTL_DEV_RDM_STRICT 2 > > I don't understand why we require a NO_RDM flag. Whether there > is RDM associated with a given device, it's reported by system > BIOS or through cmdline extension in coming patch. Why do we > require the hypercall to ask NO_RDM to hypervisor? The only flags > we want to pass to hypervisor is relaxed/strict policy, so hypervisor > know whether to fail or warn upon caught conflicts of identity > mapping... > This is just introduced to ARM as we discussed previously since we're touching some common interfaces. Thanks Tiejun