All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Chen, Tiejun" <tiejun.chen@intel.com>
To: Wei Liu <wei.liu2@citrix.com>
Cc: kevin.tian@intel.com, ian.campbell@citrix.com,
	andrew.cooper3@citrix.com, tim@xen.org, xen-devel@lists.xen.org,
	stefano.stabellini@citrix.com, JBeulich@suse.com,
	yang.z.zhang@intel.com, Ian.Jackson@eu.citrix.com
Subject: Re: [RFC][v2][PATCH 01/14] tools: introduce some new parameters to set rdm policy
Date: Wed, 03 Jun 2015 09:35:16 +0800	[thread overview]
Message-ID: <556E59D4.6080804@intel.com> (raw)
In-Reply-To: <20150602155715.GT19403@zion.uk.xensource.com>

>> +=item B<rdm= "RDM_RESERVE_STRING" >
>
> Stray space after before and after "RDM_RESERVE_STRING".

Sure,

=item B<rdm="RDM_RESERVE_STRING">

>
>> +
>> +(HVM/x86 only) Specifies the information about Reserved Device Memory (RDM),
>> +which is necessary to enable robust device passthrough usage. One example of
>
> Delete "usage".

Okay.

>
>> +RDM is reported through ACPI Reserved Memory Region Reporting (RMRR)
>> +structure on x86 platform.
>> +
>> +B<RDM_RESERVE_STRING> has the form C<[KEY=VALUE,KEY=VALUE,...> where:
>> +
>> +=over 4
>> +
>> +=item B<KEY=VALUE>
>> +
>> +Possible B<KEY>s are:
>> +
>> +=over 4
>> +
>> +=item B<type="STRING">
>> +
>> +Currently we just have two types:
>> +
>> +"host" means all reserved device memory on this platform should be reserved
>> +in this VM's pfn space. This global RDM parameter allows user to specify
>
> PFN is Xen internal terminology. Do you mean "guest address space"? Note
> that the reader is system administrators who might not know / want to
> know Xen internals.

Sure.

>
>> +reserved regions explicitly. And using "host" to include all reserved regions
>> +reported on this platform which is good to handle hotplug scenario. In the
>> +future this parameter may be further extended to allow specifying random
>> +regions, e.g. even those belonging to another platform as a preparation
>
> Extending how? What's your envisaged syntax for those random regions?

We didn't go into details while discussing that design. Maybe we can do 
something like this,

rdm="type=host,reserve=strict,rdm_add=size[KMG][@offset[KMG]],size[KMG][@offset[KMG]],..."

> Should you want to reserve more, an array is more useful. Could you

Yeah.

> provide some examples?

But we may have alternative approach to this when I noticed some guys 
are trying to delivery some patches about setting rmrr region by xen 
commandline. So I also would like to check this likelihood when we can 
step forward.

>
>> +for live migration with passthrough devices.
>> +
>> +"none" means we have nothing to do all reserved regions and ignore all policies,
>> +so guest work as before.
>> +
>> +=over 4
>> +
>> +=item B<reserve="STRING">
>> +
>> +Conflict may be detected when reserving reserved device memory in gfn space.
>
> GFN is a Xen internal terminology. Maybe you should use "guest address
> space"?
>
> Nonetheless the terminology throughout this document should be
> consistent.

Sure, so I will do this,

s/pfn/guest address space/g

s/gfn/guest address space/g

>
>> +"strict" means an unsolved conflict leads to immediate VM crash, while
>> +"relaxed" allows VM moving forward with a warning message thrown out. "relaxed"
>> +is default.
>> +
>> +Note this may be overrided by another sub item, rdm_reserve, in pci device.
>> +
>
> "overridden by rdm_reserve option in PCI device configuration".

Okay.

>
>>   =item B<pci=[ "PCI_SPEC_STRING", "PCI_SPEC_STRING", ... ]>
>>
>>   Specifies the host PCI devices to passthrough to this guest. Each B<PCI_SPEC_STRING>
>> @@ -707,6 +750,20 @@ dom0 without confirmation.  Please use with care.
>>   D0-D3hot power management states for the PCI device. False (0) by
>>   default.
>>
>> +=item B<rdm_reserv="STRING">
>> +
>> +(HVM/x86 only) Specifies the information about Reserved Device Memory (RDM),
>> +which is necessary to enable robust device passthrough usage. One example of
>
> Delete "usage".
>
>> +RDM is reported through ACPI Reserved Memory Region Reporting (RMRR)
>> +structure on x86 platform.
>> +
>> +Conflict may be detected when reserving reserved device memory in gfn space.
>> +"strict" means an unsolved conflict leads to immediate VM crash, while
>> +"relaxed" allows VM moving forward with a warning message thrown out. "strict"
>> +is default.
>> +
>
> Actually these two paragraphs are the same as before. You can just point
> readers to previous sections instead of copying them here.

So instead,

(HVM/x86 only) This is same as reserve option above but just specific
to a given device, and "strict" is default here.

>
>> +Note this would override global B<rdm> option.
>> +
>>   =back
>>
>>   =back
>> diff --git a/docs/misc/vtd.txt b/docs/misc/vtd.txt
>> index 9af0e99..7d63c47 100644
>> --- a/docs/misc/vtd.txt
>> +++ b/docs/misc/vtd.txt
>> @@ -111,6 +111,30 @@ in the config file:
>>   To override for a specific device:
>>   	pci = [ '01:00.0,msitranslate=0', '03:00.0' ]
>>
>> +RDM, 'reserved device memory', for PCI Device Passthrough
>> +---------------------------------------------------------
>> +
>> +There are some devices the BIOS controls, for e.g. USB devices to perform
>> +PS2 emulation. The regions of memory used for these devices are marked
>> +reserved in the e820 map. When we turn on DMA translation, DMA to those
>> +regions will fail. Hence BIOS uses RMRR to specify these regions along with
>> +devices that need to access these regions. OS is expected to setup
>> +identity mappings for these regions for these devices to access these regions.
>> +
>> +While creating a VM we should reserve them in advance, and avoid any conflicts.
>> +So we introduce user configurable parameters to specify RDM resource and
>> +according policies,
>> +
>> +To enable this globally, add "rdm" in the config file:
>> +
>> +    rdm = "type=host, reserve=relaxed"   (default policy is "relaxed")
>> +
>> +Or just for a specific device:
>> +
>> +    pci = [ '01:00.0,rdm_reserve=relaxed', '03:00.0,rdm_reserve=strict' ]
>> +
>> +For all the options available to RDM, see xl.cfg(5).
>> +
>>
>>   Caveat on Conventional PCI Device Passthrough
>>   ---------------------------------------------
>> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
>> index f0da7dc..d649ead 100644
>> --- a/tools/libxl/libxl_create.c
>> +++ b/tools/libxl/libxl_create.c
>> @@ -100,6 +100,12 @@ static int sched_params_valid(libxl__gc *gc,
>>       return 1;
>>   }
>>
>> +void libxl__rdm_setdefault(libxl__gc *gc, libxl_domain_build_info *b_info)
>> +{
>> +    b_info->rdm.type = LIBXL_RDM_RESERVE_TYPE_NONE;

Based on our previous discussion, I will initial this firstly,

+libxl_rdm_reserve_type = Enumeration("rdm_reserve_type", [
+    (0, "none"),
+    (1, "host"),
+    ], init_val = "LIBXL_RDM_RESERVE_TYPE_NONE")
+

and then, I would remove this line since right now we just own two 
options, "none" or "host". And both they're fine.

>> +    b_info->rdm.reserve = LIBXL_RDM_RESERVE_FLAG_RELAXED;
>
> No, not like this. You set everything back to none and relaxed even if
> it is set before this point.
>
> It should be
>      if (xxx == DEFAULT_SENTINEL_VALUE)
>          xxx = THE_DEFAULT_YOU_WANT;
>
> Have a look at libxl__device_nic_setdefault etc to get an idea
> how it works. Don't hesitate to ask if I'm not clear enough.

But indeed, here we should set rdm.reserve as you said,

+void libxl__rdm_setdefault(libxl__gc *gc, libxl_domain_build_info *b_info)
+{
+    if (b_info->rdm.reserve == LIBXL_RDM_RESERVE_FLAG_INVALID)
+        b_info->rdm.reserve = LIBXL_RDM_RESERVE_FLAG_RELAXED;
+}
+

>
>> +}
>> +
>>   int libxl__domain_build_info_setdefault(libxl__gc *gc,
>>                                           libxl_domain_build_info *b_info)
>>   {
>> @@ -410,6 +416,8 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
>>                      libxl_domain_type_to_string(b_info->type));
>>           return ERROR_INVAL;
>>       }
>> +
>> +    libxl__rdm_setdefault(gc, b_info);
>>       return 0;
>>   }
>>
>> @@ -1439,6 +1447,11 @@ static void domcreate_attach_pci(libxl__egc *egc, libxl__multidev *multidev,
>>       }
>>
>>       for (i = 0; i < d_config->num_pcidevs; i++) {
>> +        /*
>> +         * If the rdm global policy is 'force' we should override each device.
>> +         */
>
> "strict" not "force"

Right.

Thanks
Tiejun

>
> Wei.
>

  reply	other threads:[~2015-06-03  1:35 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-22  9:35 [RFC][v2][PATCH 00/14] Fix RMRR Tiejun Chen
2015-05-22  9:35 ` [RFC][v2][PATCH 01/14] tools: introduce some new parameters to set rdm policy Tiejun Chen
2015-06-02 15:57   ` Wei Liu
2015-06-03  1:35     ` Chen, Tiejun [this message]
2015-06-07 11:06       ` Wei Liu
2015-06-08  1:42         ` Chen, Tiejun
2015-05-22  9:35 ` [RFC][v2][PATCH 02/14] introduce XENMEM_reserved_device_memory_map Tiejun Chen
2015-05-22  9:35 ` [RFC][v2][PATCH 03/14] tools/libxc: Expose new hypercall xc_reserved_device_memory_map Tiejun Chen
2015-05-22  9:35 ` [RFC][v2][PATCH 04/14] tools/libxl: detect and avoid conflicts with RDM Tiejun Chen
2015-06-02 16:29   ` Wei Liu
2015-06-03  2:25     ` Chen, Tiejun
2015-06-07 11:20       ` Wei Liu
2015-06-08  2:16         ` Chen, Tiejun
2015-05-22  9:35 ` [RFC][v2][PATCH 05/14] xen/x86/p2m: introduce set_identity_p2m_entry Tiejun Chen
2015-05-28 12:27   ` Jan Beulich
2015-05-29  1:19     ` Chen, Tiejun
2015-05-22  9:35 ` [RFC][v2][PATCH 06/14] xen:vtd: create RMRR mapping Tiejun Chen
2015-05-22  9:35 ` [RFC][v2][PATCH 07/14] xen/passthrough: extend hypercall to support rdm reservation policy Tiejun Chen
2015-05-22 10:33   ` Julien Grall
2015-05-25  2:09     ` Chen, Tiejun
2015-05-25 10:02       ` Julien Grall
2015-05-25 10:50         ` Chen, Tiejun
2015-05-25 11:42           ` Julien Grall
2015-05-26  0:42             ` Chen, Tiejun
2015-05-22  9:35 ` [RFC][v2][PATCH 08/14] tools: extend xc_assign_device() " Tiejun Chen
2015-06-02 16:36   ` Wei Liu
2015-06-03  2:58     ` Chen, Tiejun
2015-06-07 11:27       ` Wei Liu
2015-06-09  5:42         ` Chen, Tiejun
2015-05-22  9:35 ` [RFC][v2][PATCH 09/14] xen: enable XENMEM_memory_map in hvm Tiejun Chen
2015-05-22  9:35 ` [RFC][v2][PATCH 10/14] tools: extend XENMEM_set_memory_map Tiejun Chen
2015-05-22 10:25   ` Julien Grall
2015-05-25  2:00     ` Chen, Tiejun
2015-06-02 16:42   ` Wei Liu
2015-06-03  3:06     ` Chen, Tiejun
2015-05-22  9:35 ` [RFC][v2][PATCH 11/14] hvmloader: get guest memory map into memory_map[] Tiejun Chen
2015-05-22  9:35 ` [RFC][v2][PATCH 12/14] hvmloader/pci: skip reserved ranges Tiejun Chen
2015-05-22  9:35 ` [RFC][v2][PATCH 13/14] hvmloader/e820: construct guest e820 table Tiejun Chen
2015-05-22  9:35 ` [RFC][v2][PATCH 14/14] xen/vtd: enable USB device assignment Tiejun Chen
2015-05-22  9:46 ` [RFC][v2][PATCH 00/14] Fix RMRR Jan Beulich
2015-05-28  5:48   ` Chen, Tiejun
2015-05-28  7:55     ` Jan Beulich
2015-05-29  7:58       ` Chen, Tiejun

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=556E59D4.6080804@intel.com \
    --to=tiejun.chen@intel.com \
    --cc=Ian.Jackson@eu.citrix.com \
    --cc=JBeulich@suse.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=ian.campbell@citrix.com \
    --cc=kevin.tian@intel.com \
    --cc=stefano.stabellini@citrix.com \
    --cc=tim@xen.org \
    --cc=wei.liu2@citrix.com \
    --cc=xen-devel@lists.xen.org \
    --cc=yang.z.zhang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.