From: Jennifer Herbert <Jennifer.Herbert@citrix.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
Paul Durrant <paul.durrant@citrix.com>,
Wei Liu <wei.liu2@citrix.com>,
Ian Jackson <ian.jackson@eu.citrix.com>,
Xen-devel <xen-devel@lists.xen.org>
Subject: Re: [PATCH v2] dm_op: Add xendevicemodel_modified_memory_bulk.
Date: Wed, 22 Mar 2017 19:55:42 +0000 [thread overview]
Message-ID: <58D2D6BE.7000705@citrix.com> (raw)
In-Reply-To: <58D261270200007800146204@prv-mh.provo.novell.com>
On 22/03/17 10:33, Jan Beulich wrote:
>>>> On 21.03.17 at 14:59, <jennifer.herbert@citrix.com> wrote:
>> This version of this patch removes the need for the 'offset' parameter,
>> by instead reducing nr_extents, and working backwards from the end of
>> the array.
>>
>> This patch also removes the need to ever write back the passed array of
>> extents to the guest, but instead putting its partial progress in a
>> 'pfns_processed' parameter, which replaces the old 'offset' parameter.
>> As with the 'offset' parameter, 'pfns_processed' should be set to 0 on
>> calling.
> So how is the need for 'pfns_processed' better than the prior need
> for 'offset'? I continue to not see why you'd need any extra field, as
> you can write back to 'first_pfn' and 'nr' just like the old code did. (But
> note that I'm not outright objecting to this solution, I'd first like to
> understand why it was chosen.)
>
With v1 of the patch, on continuation, two buffers got written back to
the guest: one to update 'offset', and one to update first_pfn (in the
array). Although I can't think of a concrete example where this would be
a problem, I think callers may not expect items in the passed array to
be altered behind their backs, and if that data were re-used for
something, it could cause a bug. There is precedent for changing the op
buffer, and using pfns_processed means we never have to change the
array, which I think is cleaner and more intuitive, and it halves the
number of copy-backs.
I considered your suggestion of modifying the handle array, but I think
that would make the code much harder to follow, and it would still
require data to be written back to the guest - just in a different
place. Simply reducing nr_extents seemed the cleaner solution.
I can't see the problem with having an argument for continuation - the
xen_dm_op will remain the same size whether we use the argument space or
not. If it's just about how the API looks, I find that highly subjective
- it's no secret that continuations happen and that data gets stashed in
these parameters - so I see no reason why having a dedicated parameter
for this is a problem.
If we want to hide this continuation information, we could define:

struct xen_dm_op_modified_memory {
    /* IN - number of extents. */
    uint32_t nr_extents;
};

struct xen_dm_op_modified_memory_expanded {
    struct xen_dm_op_modified_memory modified_memory;
    uint32_t pfns_processed;
};

and include both structures in the xen_dm_op union.
The caller would use struct xen_dm_op_modified_memory, and thanks to the
memset(&op, 0, sizeof(op));, the value of pfns_processed would end up as
zero. Xen could use the expanded structure, and make use of
pfns_processed.
Or, we could re-purpose the 'pad' member of struct xen_dm_op, call it
continuation_data, and have it as a general-purpose continuation
variable that the caller wouldn't pay any attention to.
What do you think?
>> --- a/xen/arch/x86/hvm/dm.c
>> +++ b/xen/arch/x86/hvm/dm.c
>> @@ -119,56 +119,89 @@ static int set_isa_irq_level(struct domain *d, uint8_t isa_irq,
>> }
>>
>> static int modified_memory(struct domain *d,
>> - struct xen_dm_op_modified_memory *data)
>> + struct xen_dm_op_modified_memory *header,
>> + xen_dm_op_buf_t* buf)
>> {
>> - xen_pfn_t last_pfn = data->first_pfn + data->nr - 1;
>> - unsigned int iter = 0;
>> + /* Process maximum of 256 pfns before checking for continuation */
>> + const unsigned int cont_check_interval = 0xff;
> Iirc the question was asked on v1 already: 256 (as the comment
> says) or 255 (as the value enforces)?
aw shucks! I had it as 255, and Paul said "255 or 256?", and I changed
it to 256 without thinking, assuming an off-by-one error and that he was
right. But on a second's thought, it should be 255; Paul was actually
referring to a comment later in the code, which said 256.
>> int rc = 0;
>> if ( copy_from_guest_offset(&extent, buf->h, rem_extents - 1, 1) )
>> + return -EFAULT;
>> +
>> + pfn = extent.first_pfn + header->pfns_processed;
>> + batch_nr = extent.nr - header->pfns_processed;
> Even if this looks to be harmless to the hypervisor, I dislike the
> overflows you allow for here.
I'll add a check for (header->pfns_processed > extent.nr), returning
-EINVAL.
>> @@ -441,13 +474,8 @@ static int dm_op(domid_t domid,
>> struct xen_dm_op_modified_memory *data =
>> &op.u.modified_memory;
>>
>> - const_op = false;
>> -
>> - rc = -EINVAL;
>> - if ( data->pad )
>> - break;
> Where did this check go?
data->pad is undefined here. But I could add a check for extent.pad in
the modified_memory function.
>> --- a/xen/include/public/hvm/dm_op.h
>> +++ b/xen/include/public/hvm/dm_op.h
>> @@ -237,13 +237,24 @@ struct xen_dm_op_set_pci_link_route {
>> * XEN_DMOP_modified_memory: Notify that a set of pages were modified by
>> * an emulator.
>> *
>> - * NOTE: In the event of a continuation, the @first_pfn is set to the
>> - * value of the pfn of the remaining set of pages and @nr reduced
>> - * to the size of the remaining set.
>> + * DMOP buf 1 contains an array of xen_dm_op_modified_memory_extent with
>> + * @nr_extent entries. The value of @nr_extent will be reduced on
>> + * continuation.
>> + *
>> + * @pfns_processed is used for continuation, and not intended to be useful
>> + * to the caller. It gives the number of pfns already processed (and can
>> + * be skipped) in the last extent. This should always be set to zero.
>> */
>> #define XEN_DMOP_modified_memory 11
>>
>> struct xen_dm_op_modified_memory {
>> + /* IN - number of extents. */
>> + uint32_t nr_extents;
>> + /* IN - should be set to 0, and is updated on continuation. */
>> + uint32_t pfns_processed;
> I'd prefer if the field (if to be kept) wasn't named after its current
> purpose, nor should we state anything beyond it needing to be
> zero when first invoking the call. These are implementation details
> which callers should not rely on.
Assuming we keep it, how about calling it "reserved", with a comment in
dm.c explaining what it's actually for?
Let me know what you think,
Cheers,
-jenny
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
Thread overview: 8+ messages
2017-03-21 13:59 [PATCH v2] dm_op: Add xendevicemodel_modified_memory_bulk Jennifer Herbert
2017-03-22 10:33 ` Jan Beulich
2017-03-22 19:55 ` Jennifer Herbert [this message]
2017-03-23 8:35 ` Jan Beulich
2017-03-23 14:54 ` Jennifer Herbert
2017-03-23 15:58 ` Jan Beulich
2017-03-23 18:44 ` Jennifer Herbert
2017-03-24 7:21 ` Jan Beulich