xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: "Chen, Tiejun" <tiejun.chen@intel.com>
To: Wei Liu <wei.liu2@citrix.com>
Cc: kevin.tian@intel.com, ian.campbell@citrix.com,
	andrew.cooper3@citrix.com, tim@xen.org, xen-devel@lists.xen.org,
	stefano.stabellini@citrix.com, JBeulich@suse.com,
	yang.z.zhang@intel.com, Ian.Jackson@eu.citrix.com
Subject: Re: [RFC][v2][PATCH 01/14] tools: introduce some new parameters to set rdm policy
Date: Wed, 03 Jun 2015 09:35:16 +0800	[thread overview]
Message-ID: <556E59D4.6080804@intel.com> (raw)
In-Reply-To: <20150602155715.GT19403@zion.uk.xensource.com>

>> +=item B<rdm= "RDM_RESERVE_STRING" >
>
> Stray space after before and after "RDM_RESERVE_STRING".

Sure,

=item B<rdm="RDM_RESERVE_STRING">

>
>> +
>> +(HVM/x86 only) Specifies the information about Reserved Device Memory (RDM),
>> +which is necessary to enable robust device passthrough usage. One example of
>
> Delete "usage".

Okay.

>
>> +RDM is reported through ACPI Reserved Memory Region Reporting (RMRR)
>> +structure on x86 platform.
>> +
>> +B<RDM_RESERVE_STRING> has the form C<[KEY=VALUE,KEY=VALUE,...> where:
>> +
>> +=over 4
>> +
>> +=item B<KEY=VALUE>
>> +
>> +Possible B<KEY>s are:
>> +
>> +=over 4
>> +
>> +=item B<type="STRING">
>> +
>> +Currently we just have two types:
>> +
>> +"host" means all reserved device memory on this platform should be reserved
>> +in this VM's pfn space. This global RDM parameter allows user to specify
>
> PFN is Xen internal terminology. Do you mean "guest address space"? Note
> that the reader is system administrators who might not know / want to
> know Xen internals.

Sure.

>
>> +reserved regions explicitly. And using "host" to include all reserved regions
>> +reported on this platform which is good to handle hotplug scenario. In the
>> +future this parameter may be further extended to allow specifying random
>> +regions, e.g. even those belonging to another platform as a preparation
>
> Extending how? What's your envisaged syntax for those random regions?

We didn't go into details while discussing that design. Maybe we can do 
something like this,

rdm="type=host,reserve=strict,rdm_add=size[KMG][@offset[KMG]],size[KMG][@offset[KMG]],..."

> Should you want to reserve more, an array is more useful. Could you

Yeah.

> provide some examples?

But we may have alternative approach to this when I noticed some guys 
are trying to delivery some patches about setting rmrr region by xen 
commandline. So I also would like to check this likelihood when we can 
step forward.

>
>> +for live migration with passthrough devices.
>> +
>> +"none" means we have nothing to do all reserved regions and ignore all policies,
>> +so guest work as before.
>> +
>> +=over 4
>> +
>> +=item B<reserve="STRING">
>> +
>> +Conflict may be detected when reserving reserved device memory in gfn space.
>
> GFN is a Xen internal terminology. Maybe you should use "guest address
> space"?
>
> Nonetheless the terminology throughout this document should be
> consistent.

Sure, so I will do this,

s/pfn/guest address space/g

s/gfn/guest address space/g

>
>> +"strict" means an unsolved conflict leads to immediate VM crash, while
>> +"relaxed" allows VM moving forward with a warning message thrown out. "relaxed"
>> +is default.
>> +
>> +Note this may be overrided by another sub item, rdm_reserve, in pci device.
>> +
>
> "overridden by rdm_reserve option in PCI device configuration".

Okay.

>
>>   =item B<pci=[ "PCI_SPEC_STRING", "PCI_SPEC_STRING", ... ]>
>>
>>   Specifies the host PCI devices to passthrough to this guest. Each B<PCI_SPEC_STRING>
>> @@ -707,6 +750,20 @@ dom0 without confirmation.  Please use with care.
>>   D0-D3hot power management states for the PCI device. False (0) by
>>   default.
>>
>> +=item B<rdm_reserv="STRING">
>> +
>> +(HVM/x86 only) Specifies the information about Reserved Device Memory (RDM),
>> +which is necessary to enable robust device passthrough usage. One example of
>
> Delete "usage".
>
>> +RDM is reported through ACPI Reserved Memory Region Reporting (RMRR)
>> +structure on x86 platform.
>> +
>> +Conflict may be detected when reserving reserved device memory in gfn space.
>> +"strict" means an unsolved conflict leads to immediate VM crash, while
>> +"relaxed" allows VM moving forward with a warning message thrown out. "strict"
>> +is default.
>> +
>
> Actually these two paragraphs are the same as before. You can just point
> readers to previous sections instead of copying them here.

So instead,

(HVM/x86 only) This is same as reserve option above but just specific
to a given device, and "strict" is default here.

>
>> +Note this would override global B<rdm> option.
>> +
>>   =back
>>
>>   =back
>> diff --git a/docs/misc/vtd.txt b/docs/misc/vtd.txt
>> index 9af0e99..7d63c47 100644
>> --- a/docs/misc/vtd.txt
>> +++ b/docs/misc/vtd.txt
>> @@ -111,6 +111,30 @@ in the config file:
>>   To override for a specific device:
>>   	pci = [ '01:00.0,msitranslate=0', '03:00.0' ]
>>
>> +RDM, 'reserved device memory', for PCI Device Passthrough
>> +---------------------------------------------------------
>> +
>> +There are some devices the BIOS controls, for e.g. USB devices to perform
>> +PS2 emulation. The regions of memory used for these devices are marked
>> +reserved in the e820 map. When we turn on DMA translation, DMA to those
>> +regions will fail. Hence BIOS uses RMRR to specify these regions along with
>> +devices that need to access these regions. OS is expected to setup
>> +identity mappings for these regions for these devices to access these regions.
>> +
>> +While creating a VM we should reserve them in advance, and avoid any conflicts.
>> +So we introduce user configurable parameters to specify RDM resource and
>> +according policies,
>> +
>> +To enable this globally, add "rdm" in the config file:
>> +
>> +    rdm = "type=host, reserve=relaxed"   (default policy is "relaxed")
>> +
>> +Or just for a specific device:
>> +
>> +    pci = [ '01:00.0,rdm_reserve=relaxed', '03:00.0,rdm_reserve=strict' ]
>> +
>> +For all the options available to RDM, see xl.cfg(5).
>> +
>>
>>   Caveat on Conventional PCI Device Passthrough
>>   ---------------------------------------------
>> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
>> index f0da7dc..d649ead 100644
>> --- a/tools/libxl/libxl_create.c
>> +++ b/tools/libxl/libxl_create.c
>> @@ -100,6 +100,12 @@ static int sched_params_valid(libxl__gc *gc,
>>       return 1;
>>   }
>>
>> +void libxl__rdm_setdefault(libxl__gc *gc, libxl_domain_build_info *b_info)
>> +{
>> +    b_info->rdm.type = LIBXL_RDM_RESERVE_TYPE_NONE;

Based on our previous discussion, I will initial this firstly,

+libxl_rdm_reserve_type = Enumeration("rdm_reserve_type", [
+    (0, "none"),
+    (1, "host"),
+    ], init_val = "LIBXL_RDM_RESERVE_TYPE_NONE")
+

and then, I would remove this line since right now we just own two 
options, "none" or "host". And both they're fine.

>> +    b_info->rdm.reserve = LIBXL_RDM_RESERVE_FLAG_RELAXED;
>
> No, not like this. You set everything back to none and relaxed even if
> it is set before this point.
>
> It should be
>      if (xxx == DEFAULT_SENTINEL_VALUE)
>          xxx = THE_DEFAULT_YOU_WANT;
>
> Have a look at libxl__device_nic_setdefault etc to get an idea
> how it works. Don't hesitate to ask if I'm not clear enough.

But indeed, here we should set rdm.reserve as you said,

+void libxl__rdm_setdefault(libxl__gc *gc, libxl_domain_build_info *b_info)
+{
+    if (b_info->rdm.reserve == LIBXL_RDM_RESERVE_FLAG_INVALID)
+        b_info->rdm.reserve = LIBXL_RDM_RESERVE_FLAG_RELAXED;
+}
+

>
>> +}
>> +
>>   int libxl__domain_build_info_setdefault(libxl__gc *gc,
>>                                           libxl_domain_build_info *b_info)
>>   {
>> @@ -410,6 +416,8 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
>>                      libxl_domain_type_to_string(b_info->type));
>>           return ERROR_INVAL;
>>       }
>> +
>> +    libxl__rdm_setdefault(gc, b_info);
>>       return 0;
>>   }
>>
>> @@ -1439,6 +1447,11 @@ static void domcreate_attach_pci(libxl__egc *egc, libxl__multidev *multidev,
>>       }
>>
>>       for (i = 0; i < d_config->num_pcidevs; i++) {
>> +        /*
>> +         * If the rdm global policy is 'force' we should override each device.
>> +         */
>
> "strict" not "force"

Right.

Thanks
Tiejun

>
> Wei.
>

  reply	other threads:[~2015-06-03  1:35 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-22  9:35 [RFC][v2][PATCH 00/14] Fix RMRR Tiejun Chen
2015-05-22  9:35 ` [RFC][v2][PATCH 01/14] tools: introduce some new parameters to set rdm policy Tiejun Chen
2015-06-02 15:57   ` Wei Liu
2015-06-03  1:35     ` Chen, Tiejun [this message]
2015-06-07 11:06       ` Wei Liu
2015-06-08  1:42         ` Chen, Tiejun
2015-05-22  9:35 ` [RFC][v2][PATCH 02/14] introduce XENMEM_reserved_device_memory_map Tiejun Chen
2015-05-22  9:35 ` [RFC][v2][PATCH 03/14] tools/libxc: Expose new hypercall xc_reserved_device_memory_map Tiejun Chen
2015-05-22  9:35 ` [RFC][v2][PATCH 04/14] tools/libxl: detect and avoid conflicts with RDM Tiejun Chen
2015-06-02 16:29   ` Wei Liu
2015-06-03  2:25     ` Chen, Tiejun
2015-06-07 11:20       ` Wei Liu
2015-06-08  2:16         ` Chen, Tiejun
2015-05-22  9:35 ` [RFC][v2][PATCH 05/14] xen/x86/p2m: introduce set_identity_p2m_entry Tiejun Chen
2015-05-28 12:27   ` Jan Beulich
2015-05-29  1:19     ` Chen, Tiejun
2015-05-22  9:35 ` [RFC][v2][PATCH 06/14] xen:vtd: create RMRR mapping Tiejun Chen
2015-05-22  9:35 ` [RFC][v2][PATCH 07/14] xen/passthrough: extend hypercall to support rdm reservation policy Tiejun Chen
2015-05-22 10:33   ` Julien Grall
2015-05-25  2:09     ` Chen, Tiejun
2015-05-25 10:02       ` Julien Grall
2015-05-25 10:50         ` Chen, Tiejun
2015-05-25 11:42           ` Julien Grall
2015-05-26  0:42             ` Chen, Tiejun
2015-05-22  9:35 ` [RFC][v2][PATCH 08/14] tools: extend xc_assign_device() " Tiejun Chen
2015-06-02 16:36   ` Wei Liu
2015-06-03  2:58     ` Chen, Tiejun
2015-06-07 11:27       ` Wei Liu
2015-06-09  5:42         ` Chen, Tiejun
2015-05-22  9:35 ` [RFC][v2][PATCH 09/14] xen: enable XENMEM_memory_map in hvm Tiejun Chen
2015-05-22  9:35 ` [RFC][v2][PATCH 10/14] tools: extend XENMEM_set_memory_map Tiejun Chen
2015-05-22 10:25   ` Julien Grall
2015-05-25  2:00     ` Chen, Tiejun
2015-06-02 16:42   ` Wei Liu
2015-06-03  3:06     ` Chen, Tiejun
2015-05-22  9:35 ` [RFC][v2][PATCH 11/14] hvmloader: get guest memory map into memory_map[] Tiejun Chen
2015-05-22  9:35 ` [RFC][v2][PATCH 12/14] hvmloader/pci: skip reserved ranges Tiejun Chen
2015-05-22  9:35 ` [RFC][v2][PATCH 13/14] hvmloader/e820: construct guest e820 table Tiejun Chen
2015-05-22  9:35 ` [RFC][v2][PATCH 14/14] xen/vtd: enable USB device assignment Tiejun Chen
2015-05-22  9:46 ` [RFC][v2][PATCH 00/14] Fix RMRR Jan Beulich
2015-05-28  5:48   ` Chen, Tiejun
2015-05-28  7:55     ` Jan Beulich
2015-05-29  7:58       ` Chen, Tiejun

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=556E59D4.6080804@intel.com \
    --to=tiejun.chen@intel.com \
    --cc=Ian.Jackson@eu.citrix.com \
    --cc=JBeulich@suse.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=ian.campbell@citrix.com \
    --cc=kevin.tian@intel.com \
    --cc=stefano.stabellini@citrix.com \
    --cc=tim@xen.org \
    --cc=wei.liu2@citrix.com \
    --cc=xen-devel@lists.xen.org \
    --cc=yang.z.zhang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).