From: "Chen, Tiejun" <tiejun.chen@intel.com>
To: Jan Beulich <JBeulich@suse.com>,
ian.campbell@citrix.com, stefano.stabellini@citrix.com,
wei.liu2@citrix.com, Ian.Jackson@eu.citrix.com
Cc: kevin.tian@intel.com, andrew.cooper3@citrix.com, tim@xen.org,
xen-devel@lists.xen.org, yang.z.zhang@intel.com
Subject: Re: [RFC][PATCH 04/13] tools/libxl: detect and avoid conflicts with RDM
Date: Thu, 07 May 2015 10:22:20 +0800 [thread overview]
Message-ID: <554ACC5C.2040300@intel.com> (raw)
In-Reply-To: <554A509402000078000773F6@mail.emea.novell.com>
On 2015/5/6 23:34, Jan Beulich wrote:
>>>> On 06.05.15 at 17:00, <tiejun.chen@intel.com> wrote:
>> On 2015/4/20 19:13, Jan Beulich wrote:
>>>>>> On 10.04.15 at 11:21, <tiejun.chen@intel.com> wrote:
>>>> --- a/tools/libxc/xc_domain.c
>>>> +++ b/tools/libxc/xc_domain.c
>>>> @@ -1665,6 +1665,46 @@ int xc_assign_device(
>>>> return do_domctl(xch, &domctl);
>>>> }
>>>>
>>>> +struct xen_reserved_device_memory
>>>> +*xc_device_get_rdm(xc_interface *xch,
>>>> + uint32_t flag,
>>>> + uint16_t seg,
>>>> + uint8_t bus,
>>>> + uint8_t devfn,
>>>> + unsigned int *nr_entries)
>>>> +{
>>>
>>> So what's the point of having both this new function and
>>> xc_reserved_device_memory_map()? Is the latter useful for
>>> anything besides the purpose here?
>>
>> I just hope xc_reserved_device_memory_map() is a standard interface to
>> call that XENMEM_reserved_device_memory_map, but xc_device_get_rdm() can
>> handle some errors in current case.
>>
>> I think you are hinting we just need one, right?
>
> Correct. But remember - I'm not a maintainer of this code, so
But this may be a little complex with one...
> maintainers may be of different opinion.
Anyway, let me ask our tools maintainers.
Campbell, Jackson, Wei and Stefano,
What about your concern to this?
>
>>>> + struct xen_reserved_device_memory *xrdm = NULL;
>>>> + int rc = xc_reserved_device_memory_map(xch, flag, seg, bus, devfn, xrdm,
>>>> + nr_entries);
>>>> +
>>>> + if ( rc < 0 )
>>>> + {
>>>> + if ( errno == ENOBUFS )
>>>> + {
>>>> + if ( (xrdm = malloc(*nr_entries *
>>>> + sizeof(xen_reserved_device_memory_t))) == NULL )
>>>> + {
>>>> + PERROR("Could not allocate memory.");
>>>
>>> Now that's exactly the kind of error message that makes no sense:
>>> As errno will already cause PERROR() to print something along the
>>> lines of the message you provide here, you're just creating
>>> redundancy. Indicating the purpose of the allocation, otoh, would
>>> add helpful context for the one inspecting the resulting log.
>>
>> What about this?
>>
>> PERROR("Could not allocate memory buffers to store reserved device
>> memory entries.");
>
> You kind of go from one extreme to the other - the message
> doesn't need to be overly long, but it should be distinct from
> all other messages (so that when seen one can identify what
> went wrong).
I originally refer to some existing examples like this,
int
xc_core_arch_memory_map_get(xc_interface *xch, struct
xc_core_arch_context *unused,
xc_dominfo_t *info, shared_info_any_t
*live_shinfo,
xc_core_memory_map_t **mapp,
unsigned int *nr_entries)
{
...
map = malloc(sizeof(*map));
if ( map == NULL )
{
PERROR("Could not allocate memory");
return -1;
}
Maybe this is wrong to my case. Could I change this?
PERROR("Could not allocate memory for XENMEM_reserved_device_memory_map
hypercall");
Or just give me your line.
>
>>>> @@ -302,8 +300,11 @@ static int setup_guest(xc_interface *xch,
>>>>
>>>> for ( i = 0; i < nr_pages; i++ )
>>>> page_array[i] = i;
>>>> - for ( i = mmio_start >> PAGE_SHIFT; i < nr_pages; i++ )
>>>> - page_array[i] += mmio_size >> PAGE_SHIFT;
>>>> + /*
>>>> + * This condition 'lowmem_end <= mmio_start' is always true.
>>>> + */
>>>
>>> For one I think you mean "The", not "This", as there's no such
>>> condition around here. And then - why? DYM "is supposed to
>>> always be true"? In which case you may want to check...
>>
>> I always do this inside libxl__build_hvm() but before setup_guest(),
>>
>> + if (args.lowmem_size > mmio_start)
>> + args.lowmem_size = mmio_start;
>>
>> And plus, we also another policy to rdm,
>>
>> #1. Above a predefined boundary (default 2G)
>> - move lowmem_end below reserved region to solve conflict;
>>
>> This means there's such a likelihood of args.lowmem_size < mmio_start)
>> as well.
>>
>> So here I'm saying the condition is always true.
>
> Okay, but again - if this is relevant to the following code, an
> assertion or alike may still be warranted.
Yes I should add 'assert()' here.
>
>>> and hence don't have the final say on stylistic issues, I don't see
>>> why the above couldn't be expressed with a single return statement.
>>
>> Are you saying something like this? Note this was showed by yourself
>> long time ago.
>
> I know, and hence I was puzzled to still see you use the more
> convoluted form.
>
>> static bool check_mmio_hole_conflict(uint64_t start, uint64_t memsize,
>> uint64_t mmio_start, uint64_t mmio_size)
>> {
>> return start + memsize > mmio_start && start < mmio_start + mmio_size;
>> }
>>
>> But I don't think this really can't work out our case.
>
> It's equivalent to the original you had, so I don't see what you
> mean with "this really can't work out our case".
>
Let me make this point clear.
The original implementation,
+static int check_rdm_hole(uint64_t start, uint64_t memsize,
+ uint64_t rdm_start, uint64_t rdm_size)
+{
+ if (start + memsize <= rdm_start || start >= rdm_start + rdm_size)
+ return 0;
+ else
+ return 1;
+}
means it returns 'false' in two cases:
#1. end = start + memsize; end <= rdm_start;
This region [start, end] is below of rdm entry.
#2. rdm_end = rdm_start + rdm_size; stat >= rdm_end;
This region [start, end] is above of rdm entry.
So others conditions should indicate that rdm entry is overlapping with
this region. Actually this has three cases:
#1. This region just conflicts with the first half of rdm entry;
#2. This region just conflicts with the second half of rdm entry;
#3. This whole region falls inside of rdm entry;
Then it should return 'true', right?
But with this single line,
return start + memsize > rdm_start && start < rdm_start + rdm_size;
=>
return end > rdm_start && start < rdm_end;
This just guarantee it return 'true' *only* if #3 above occurs.
>>>> +int libxl__domain_device_check_rdm(libxl__gc *gc,
>>>> + libxl_domain_config *d_config,
>>>> + uint64_t rdm_mem_guard,
>>>> + struct xc_hvm_build_args *args)
>>>> +{
>>>> + int i, j, conflict;
>>>> + libxl_ctx *ctx = libxl__gc_owner(gc);
>>>> + struct xen_reserved_device_memory *xrdm = NULL;
>>>> + unsigned int nr_all_rdms = 0;
>>>> + uint64_t rdm_start, rdm_size, highmem_end = (1ULL << 32);
>>>> + uint32_t type = d_config->b_info.rdm.type;
>>>> + uint16_t seg;
>>>> + uint8_t bus, devfn;
>>>> +
>>>> + /* Might not to expose rdm. */
>>>> + if ((type == LIBXL_RDM_RESERVE_TYPE_NONE) && !d_config->num_pcidevs)
>>>> + return 0;
>>>> +
>>>> + /* Collect all rdm info if exist. */
>>>> + xrdm = xc_device_get_rdm(ctx->xch, LIBXL_RDM_RESERVE_TYPE_HOST,
>>>> + 0, 0, 0, &nr_all_rdms);
>>>
>>> What meaning has passing a libxl private value to a libxc function?
>>
>> We intend to collect all rdm entries info in advance and then we can
>> construct d_config->rdms based on our policies as follows. Because we
>> need to first allocate d_config->rdms properly to store rdms, but in
>> some cases we don't know how many buffers are enough. For example, we
>> don't have that global flag but with multiple pci devices. And even a
>> shared entry worsen this situation.
>>
>> So here, we set that flag as LIBXL_RDM_RESERVE_TYPE_HOST but without any
>> SBDF to grab all rdms.
>
> I'm afraid you didn't get my point: Values passed to libxc should be
Sorry for this misunderstanding.
> known to libxc. Values privately defined by libxl for its own purposes
> aren't known to libxc, and hence shouldn't be passed to libxc
> functions.
I think we should set this with 'PCI_DEV_RDM_ALL' since,
struct xen_reserved_device_memory_map {
/* IN */
/* Currently just one bit to indicate checkng all Reserved Device
Memory. */
#define PCI_DEV_RDM_ALL 0x1
>
>>>> + * 'try' policy is specified, and we also mark this as INVALID not to expose
>>>> + * this entry to hvmloader.
>>>
>>> What is "this" in "... also mark this as ..."? Certainly neither the conflict
>>> nor the warning.
>>
>> Sorry, this is my fault.
>>
>> * If a conflict is detected on a given RMRR entry, an error will be
>> * returned if 'strict' policy is specified. Or conflict is treated as a
>> * warning if 'relaxed' policy is specified, and we also mark this as
>> * INVALID not to expose this entry to hvmloader.
>
> The same "this" still doesn't have anything reasonable it references. I
> think you mean "the entry" (in which case the subsequent "this entry"
> could become just "it" afaict). But (not being a native speaker) the
> grammar of the second half of the sentence looks odd (and hence
> potentially confusing) to me anyway (i.e. even with the previous
Sure, we need to make this better and clear.
> issue fixed).
* If a conflict is detected on a given RMRR entry, an error will be
* returned if 'strict' policy is specified. Instead, if 'relaxed'
policy
* specified, this conflict is treated just as a warning, but we
mark this
* RMRR entry as INVALID to indicate that this entry shouldn't be
exposed
* to hvmloader.
I hope this can help us understand what we do.
>
>>>> + *
>>>> + * Firstly we should check the case of rdm < 4G because we may need to
>>>> + * expand highmem_end.
>>>> + */
>>>> + for (i = 0; i < d_config->num_rdms; i++) {
>>>> + rdm_start = d_config->rdms[i].start;
>>>> + rdm_size = d_config->rdms[i].size;
>>>> + conflict = check_rdm_hole(0, args->lowmem_size, rdm_start, rdm_size);
>>>> +
>>>> + if (!conflict)
>>>> + continue;
>>>> +
>>>> + /*
>>>> + * Just check if RDM > our memory boundary
>>>> + */
>>>> + if (d_config->rdms[i].start > rdm_mem_guard) {
>>>> + /*
>>>> + * We will move downwards lowmem_end so we have to expand
>>>> + * highmem_end.
>>>> + */
>>>> + highmem_end += (args->lowmem_size - rdm_start);
>>>> + /* Now move downwards lowmem_end. */
>>>> + args->lowmem_size = rdm_start;
>>>
>>> Considering that the action here doesn't depend on the specific
>>> ->rdms[] slot being looked at, I don't see why the loop needs to
>>
>> I'm not sure if I understand what you mean.
>>
>> All rdm entries are organized disorderly in d_config->rdms, so we should
>> traverse all entries to make sure args->lowmem_size is below all rdms'
>> start address.
>
> I think I see what confused me: in the if() condition you reference
> d_config->rdms[i].start, yet the body of the if() has no reference
> to d_config->rdms[i] at all. If the if() used rdm_start it would
> become obvious that this is being latched at the beginning of the
Indeed, I really should use rdm_start here.
> body (which is what I overlooked, assuming the variable's value
> to have got set prior to the loop), and hence the body is not loop
> invariant.
>
So just replace d_config->rdms[i].start with rdm_start like this,
/*
* Just check if RDM > our memory boundary
*/
if (rdm_start > rdm_mem_guard) {
/*
* We will move downwards lowmem_end so we have to expand
* highmem_end.
*/
highmem_end += (args->lowmem_size - rdm_start);
/* Now move downwards lowmem_end. */
args->lowmem_size = rdm_start;
}
}
Thanks
Tiejun
next prev parent reply other threads:[~2015-05-07 2:22 UTC|newest]
Thread overview: 125+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-04-10 9:21 [RFC][PATCH 00/13] Fix RMRR Tiejun Chen
2015-04-10 9:21 ` [RFC][PATCH 01/13] tools: introduce some new parameters to set rdm policy Tiejun Chen
2015-05-08 13:04 ` Wei Liu
2015-05-11 5:35 ` Chen, Tiejun
2015-05-11 14:54 ` Wei Liu
2015-05-15 1:52 ` Chen, Tiejun
2015-05-18 1:06 ` Chen, Tiejun
2015-05-18 19:17 ` Wei Liu
2015-05-19 3:16 ` Chen, Tiejun
2015-05-19 9:42 ` Wei Liu
2015-05-19 10:50 ` Chen, Tiejun
2015-05-19 11:00 ` Wei Liu
2015-05-20 5:27 ` Chen, Tiejun
2015-05-20 8:36 ` Wei Liu
2015-05-20 8:51 ` Chen, Tiejun
2015-05-20 9:07 ` Wei Liu
2015-04-10 9:21 ` [RFC][PATCH 02/13] introduce XENMEM_reserved_device_memory_map Tiejun Chen
2015-04-16 14:59 ` Tim Deegan
2015-04-16 15:10 ` Jan Beulich
2015-04-16 15:24 ` Tim Deegan
2015-04-16 15:40 ` Tian, Kevin
2015-04-23 12:32 ` Chen, Tiejun
2015-04-23 12:59 ` Jan Beulich
2015-04-24 1:17 ` Chen, Tiejun
2015-04-24 7:21 ` Jan Beulich
2015-04-10 9:21 ` [RFC][PATCH 03/13] tools/libxc: Expose new hypercall xc_reserved_device_memory_map Tiejun Chen
2015-05-08 13:07 ` Wei Liu
2015-05-11 5:36 ` Chen, Tiejun
2015-05-11 9:50 ` Wei Liu
2015-04-10 9:21 ` [RFC][PATCH 04/13] tools/libxl: detect and avoid conflicts with RDM Tiejun Chen
2015-04-15 13:10 ` Ian Jackson
2015-04-15 18:22 ` Tian, Kevin
2015-04-23 12:31 ` Chen, Tiejun
2015-04-20 11:13 ` Jan Beulich
2015-05-06 15:00 ` Chen, Tiejun
2015-05-06 15:34 ` Jan Beulich
2015-05-07 2:22 ` Chen, Tiejun [this message]
2015-05-07 6:04 ` Jan Beulich
2015-05-08 1:14 ` Chen, Tiejun
2015-05-08 1:24 ` Chen, Tiejun
2015-05-08 15:13 ` Wei Liu
2015-05-11 6:06 ` Chen, Tiejun
2015-05-08 14:43 ` Wei Liu
2015-05-11 8:09 ` Chen, Tiejun
2015-05-11 11:32 ` Wei Liu
2015-05-14 8:27 ` Chen, Tiejun
2015-05-18 1:06 ` Chen, Tiejun
2015-05-18 20:00 ` Wei Liu
2015-05-19 1:32 ` Tian, Kevin
2015-05-19 10:22 ` Wei Liu
2015-05-19 6:47 ` Chen, Tiejun
2015-04-10 9:21 ` [RFC][PATCH 05/13] xen/x86/p2m: introduce set_identity_p2m_entry Tiejun Chen
2015-04-16 15:05 ` Tim Deegan
2015-04-23 12:33 ` Chen, Tiejun
2015-04-10 9:21 ` [RFC][PATCH 06/13] xen:vtd: create RMRR mapping Tiejun Chen
2015-04-16 15:16 ` Tim Deegan
2015-04-16 15:50 ` Tian, Kevin
2015-04-10 9:21 ` [RFC][PATCH 07/13] xen/passthrough: extend hypercall to support rdm reservation policy Tiejun Chen
2015-04-16 15:40 ` Tim Deegan
2015-04-23 12:32 ` Chen, Tiejun
2015-04-23 13:05 ` Tim Deegan
2015-04-23 13:59 ` Jan Beulich
2015-04-23 14:26 ` Tim Deegan
2015-05-04 8:15 ` Tian, Kevin
2015-04-20 13:36 ` Jan Beulich
2015-05-11 8:37 ` Chen, Tiejun
2015-05-08 16:07 ` Julien Grall
2015-05-11 8:42 ` Chen, Tiejun
2015-05-11 9:51 ` Julien Grall
2015-05-11 10:57 ` Jan Beulich
2015-05-14 5:48 ` Chen, Tiejun
2015-05-14 20:13 ` Jan Beulich
2015-05-14 5:47 ` Chen, Tiejun
2015-05-14 10:19 ` Julien Grall
2015-04-10 9:21 ` [RFC][PATCH 08/13] tools: extend xc_assign_device() " Tiejun Chen
2015-04-20 13:39 ` Jan Beulich
2015-05-11 9:45 ` Chen, Tiejun
2015-05-11 10:53 ` Jan Beulich
2015-05-14 7:04 ` Chen, Tiejun
2015-04-10 9:22 ` [RFC][PATCH 09/13] xen: enable XENMEM_set_memory_map in hvm Tiejun Chen
2015-04-16 15:42 ` Tim Deegan
2015-04-20 13:46 ` Jan Beulich
2015-05-15 2:33 ` Chen, Tiejun
2015-05-15 6:12 ` Jan Beulich
2015-05-15 6:24 ` Chen, Tiejun
2015-05-15 6:35 ` Jan Beulich
2015-05-15 6:59 ` Chen, Tiejun
2015-04-10 9:22 ` [RFC][PATCH 10/13] tools: extend XENMEM_set_memory_map Tiejun Chen
2015-04-10 10:01 ` Wei Liu
2015-04-13 2:09 ` Chen, Tiejun
2015-04-13 11:02 ` Wei Liu
2015-04-14 0:42 ` Chen, Tiejun
2015-05-05 9:32 ` Wei Liu
2015-04-20 13:51 ` Jan Beulich
2015-05-15 2:57 ` Chen, Tiejun
2015-05-15 6:16 ` Jan Beulich
2015-05-15 7:09 ` Chen, Tiejun
2015-05-15 7:32 ` Jan Beulich
2015-05-15 7:51 ` Chen, Tiejun
2015-04-10 9:22 ` [RFC][PATCH 11/13] hvmloader: get guest memory map into memory_map[] Tiejun Chen
2015-04-20 13:57 ` Jan Beulich
2015-05-15 3:10 ` Chen, Tiejun
2015-04-10 9:22 ` [RFC][PATCH 12/13] hvmloader/pci: skip reserved ranges Tiejun Chen
2015-04-20 14:21 ` Jan Beulich
2015-05-15 3:18 ` Chen, Tiejun
2015-05-15 6:19 ` Jan Beulich
2015-05-15 7:34 ` Chen, Tiejun
2015-05-15 7:44 ` Jan Beulich
2015-05-15 8:16 ` Chen, Tiejun
2015-05-15 8:31 ` Jan Beulich
2015-05-15 9:21 ` Chen, Tiejun
2015-05-15 9:32 ` Jan Beulich
2015-04-10 9:22 ` [RFC][PATCH 13/13] hvmloader/e820: construct guest e820 table Tiejun Chen
2015-04-20 14:29 ` Jan Beulich
2015-05-15 6:11 ` Chen, Tiejun
2015-05-15 6:25 ` Jan Beulich
2015-05-15 6:39 ` Chen, Tiejun
2015-05-15 6:56 ` Jan Beulich
2015-05-15 7:11 ` Chen, Tiejun
2015-05-15 7:34 ` Jan Beulich
2015-05-15 8:00 ` Chen, Tiejun
2015-05-15 8:12 ` Jan Beulich
2015-05-15 8:47 ` Chen, Tiejun
2015-05-15 8:54 ` Jan Beulich
2015-05-15 9:18 ` Chen, Tiejun
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=554ACC5C.2040300@intel.com \
--to=tiejun.chen@intel.com \
--cc=Ian.Jackson@eu.citrix.com \
--cc=JBeulich@suse.com \
--cc=andrew.cooper3@citrix.com \
--cc=ian.campbell@citrix.com \
--cc=kevin.tian@intel.com \
--cc=stefano.stabellini@citrix.com \
--cc=tim@xen.org \
--cc=wei.liu2@citrix.com \
--cc=xen-devel@lists.xen.org \
--cc=yang.z.zhang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).