From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Chen, Tiejun" Subject: Re: [RFC][v2][PATCH 01/14] tools: introduce some new parameters to set rdm policy Date: Wed, 03 Jun 2015 09:35:16 +0800 Message-ID: <556E59D4.6080804@intel.com> References: <1432287314-4388-1-git-send-email-tiejun.chen@intel.com> <1432287314-4388-2-git-send-email-tiejun.chen@intel.com> <20150602155715.GT19403@zion.uk.xensource.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20150602155715.GT19403@zion.uk.xensource.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Wei Liu Cc: kevin.tian@intel.com, ian.campbell@citrix.com, andrew.cooper3@citrix.com, tim@xen.org, xen-devel@lists.xen.org, stefano.stabellini@citrix.com, JBeulich@suse.com, yang.z.zhang@intel.com, Ian.Jackson@eu.citrix.com List-Id: xen-devel@lists.xenproject.org >> +=item B > > Stray space after before and after "RDM_RESERVE_STRING". Sure, =item B > >> + >> +(HVM/x86 only) Specifies the information about Reserved Device Memory (RDM), >> +which is necessary to enable robust device passthrough usage. One example of > > Delete "usage". Okay. > >> +RDM is reported through ACPI Reserved Memory Region Reporting (RMRR) >> +structure on x86 platform. >> + >> +B has the form C<[KEY=VALUE,KEY=VALUE,...> where: >> + >> +=over 4 >> + >> +=item B >> + >> +Possible Bs are: >> + >> +=over 4 >> + >> +=item B >> + >> +Currently we just have two types: >> + >> +"host" means all reserved device memory on this platform should be reserved >> +in this VM's pfn space. This global RDM parameter allows user to specify > > PFN is Xen internal terminology. Do you mean "guest address space"? Note > that the reader is system administrators who might not know / want to > know Xen internals. Sure. > >> +reserved regions explicitly. And using "host" to include all reserved regions >> +reported on this platform which is good to handle hotplug scenario. In the >> +future this parameter may be further extended to allow specifying random >> +regions, e.g. even those belonging to another platform as a preparation > > Extending how? What's your envisaged syntax for those random regions? We didn't go into details while discussing that design. Maybe we can do something like this, rdm="type=host,reserve=strict,rdm_add=size[KMG][@offset[KMG]],size[KMG][@offset[KMG]],..." > Should you want to reserve more, an array is more useful. Could you Yeah. > provide some examples? But we may have alternative approach to this when I noticed some guys are trying to delivery some patches about setting rmrr region by xen commandline. So I also would like to check this likelihood when we can step forward. > >> +for live migration with passthrough devices. >> + >> +"none" means we have nothing to do all reserved regions and ignore all policies, >> +so guest work as before. >> + >> +=over 4 >> + >> +=item B >> + >> +Conflict may be detected when reserving reserved device memory in gfn space. > > GFN is a Xen internal terminology. Maybe you should use "guest address > space"? > > Nonetheless the terminology throughout this document should be > consistent. Sure, so I will do this, s/pfn/guest address space/g s/gfn/guest address space/g > >> +"strict" means an unsolved conflict leads to immediate VM crash, while >> +"relaxed" allows VM moving forward with a warning message thrown out. "relaxed" >> +is default. >> + >> +Note this may be overrided by another sub item, rdm_reserve, in pci device. >> + > > "overridden by rdm_reserve option in PCI device configuration". Okay. > >> =item B >> >> Specifies the host PCI devices to passthrough to this guest. Each B >> @@ -707,6 +750,20 @@ dom0 without confirmation. Please use with care. >> D0-D3hot power management states for the PCI device. False (0) by >> default. >> >> +=item B >> + >> +(HVM/x86 only) Specifies the information about Reserved Device Memory (RDM), >> +which is necessary to enable robust device passthrough usage. One example of > > Delete "usage". > >> +RDM is reported through ACPI Reserved Memory Region Reporting (RMRR) >> +structure on x86 platform. >> + >> +Conflict may be detected when reserving reserved device memory in gfn space. >> +"strict" means an unsolved conflict leads to immediate VM crash, while >> +"relaxed" allows VM moving forward with a warning message thrown out. "strict" >> +is default. >> + > > Actually these two paragraphs are the same as before. You can just point > readers to previous sections instead of copying them here. So instead, (HVM/x86 only) This is same as reserve option above but just specific to a given device, and "strict" is default here. > >> +Note this would override global B option. >> + >> =back >> >> =back >> diff --git a/docs/misc/vtd.txt b/docs/misc/vtd.txt >> index 9af0e99..7d63c47 100644 >> --- a/docs/misc/vtd.txt >> +++ b/docs/misc/vtd.txt >> @@ -111,6 +111,30 @@ in the config file: >> To override for a specific device: >> pci = [ '01:00.0,msitranslate=0', '03:00.0' ] >> >> +RDM, 'reserved device memory', for PCI Device Passthrough >> +--------------------------------------------------------- >> + >> +There are some devices the BIOS controls, for e.g. USB devices to perform >> +PS2 emulation. The regions of memory used for these devices are marked >> +reserved in the e820 map. When we turn on DMA translation, DMA to those >> +regions will fail. Hence BIOS uses RMRR to specify these regions along with >> +devices that need to access these regions. OS is expected to setup >> +identity mappings for these regions for these devices to access these regions. >> + >> +While creating a VM we should reserve them in advance, and avoid any conflicts. >> +So we introduce user configurable parameters to specify RDM resource and >> +according policies, >> + >> +To enable this globally, add "rdm" in the config file: >> + >> + rdm = "type=host, reserve=relaxed" (default policy is "relaxed") >> + >> +Or just for a specific device: >> + >> + pci = [ '01:00.0,rdm_reserve=relaxed', '03:00.0,rdm_reserve=strict' ] >> + >> +For all the options available to RDM, see xl.cfg(5). >> + >> >> Caveat on Conventional PCI Device Passthrough >> --------------------------------------------- >> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c >> index f0da7dc..d649ead 100644 >> --- a/tools/libxl/libxl_create.c >> +++ b/tools/libxl/libxl_create.c >> @@ -100,6 +100,12 @@ static int sched_params_valid(libxl__gc *gc, >> return 1; >> } >> >> +void libxl__rdm_setdefault(libxl__gc *gc, libxl_domain_build_info *b_info) >> +{ >> + b_info->rdm.type = LIBXL_RDM_RESERVE_TYPE_NONE; Based on our previous discussion, I will initial this firstly, +libxl_rdm_reserve_type = Enumeration("rdm_reserve_type", [ + (0, "none"), + (1, "host"), + ], init_val = "LIBXL_RDM_RESERVE_TYPE_NONE") + and then, I would remove this line since right now we just own two options, "none" or "host". And both they're fine. >> + b_info->rdm.reserve = LIBXL_RDM_RESERVE_FLAG_RELAXED; > > No, not like this. You set everything back to none and relaxed even if > it is set before this point. > > It should be > if (xxx == DEFAULT_SENTINEL_VALUE) > xxx = THE_DEFAULT_YOU_WANT; > > Have a look at libxl__device_nic_setdefault etc to get an idea > how it works. Don't hesitate to ask if I'm not clear enough. But indeed, here we should set rdm.reserve as you said, +void libxl__rdm_setdefault(libxl__gc *gc, libxl_domain_build_info *b_info) +{ + if (b_info->rdm.reserve == LIBXL_RDM_RESERVE_FLAG_INVALID) + b_info->rdm.reserve = LIBXL_RDM_RESERVE_FLAG_RELAXED; +} + > >> +} >> + >> int libxl__domain_build_info_setdefault(libxl__gc *gc, >> libxl_domain_build_info *b_info) >> { >> @@ -410,6 +416,8 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc, >> libxl_domain_type_to_string(b_info->type)); >> return ERROR_INVAL; >> } >> + >> + libxl__rdm_setdefault(gc, b_info); >> return 0; >> } >> >> @@ -1439,6 +1447,11 @@ static void domcreate_attach_pci(libxl__egc *egc, libxl__multidev *multidev, >> } >> >> for (i = 0; i < d_config->num_pcidevs; i++) { >> + /* >> + * If the rdm global policy is 'force' we should override each device. >> + */ > > "strict" not "force" Right. Thanks Tiejun > > Wei. >