From: Yu Zhang <yu.c.zhang@linux.intel.com>
To: George Dunlap <george.dunlap@citrix.com>,
Jan Beulich <JBeulich@suse.com>
Cc: Kevin Tian <kevin.tian@intel.com>,
George Dunlap <george.dunlap@eu.citrix.com>,
Andrew Cooper <andrew.cooper3@citrix.com>,
Tim Deegan <tim@xen.org>,
xen-devel@lists.xen.org, Paul Durrant <paul.durrant@citrix.com>,
zhiyuan.lv@intel.com, JunNakajima <jun.nakajima@intel.com>
Subject: Re: [PATCH v4 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
Date: Thu, 23 Jun 2016 15:37:27 +0800 [thread overview]
Message-ID: <576B91B7.9070204@linux.intel.com> (raw)
In-Reply-To: <3f202dab-e36b-7b52-8ad7-1fcf0d0a15ef@citrix.com>
On 6/22/2016 7:33 PM, George Dunlap wrote:
> On 22/06/16 11:07, Yu Zhang wrote:
>>
>> On 6/22/2016 5:47 PM, George Dunlap wrote:
>>> On 22/06/16 10:29, Jan Beulich wrote:
>>>>>>> On 22.06.16 at 11:16, <george.dunlap@citrix.com> wrote:
>>>>> On 22/06/16 07:39, Jan Beulich wrote:
>>>>>>>>> On 21.06.16 at 16:38, <george.dunlap@citrix.com> wrote:
>>>>>>> On 21/06/16 10:47, Jan Beulich wrote:
>>>>>>>>>>>> And then - didn't we mean to disable that part of XenGT during
>>>>>>>>>>>> migration, i.e. temporarily accept the higher performance
>>>>>>>>>>>> overhead without the p2m_ioreq_server entries? In which case
>>>>>>>>>>>> flipping everything back to p2m_ram_rw after (completed or
>>>>>>>>>>>> canceled) migration would be exactly what we want. The (new
>>>>>>>>>>>> or previous) ioreq server should attach only afterwards, and
>>>>>>>>>>>> can then freely re-establish any p2m_ioreq_server entries it
>>>>>>>>>>>> deems necessary.
>>>>>>>>>>>>
>>>>>>>>>>> Well, I agree this part of XenGT should be disabled during
>>>>>>>>>>> migration.
>>>>>>>>>>> But in such
>>>>>>>>>>> case I think it's device model's job to trigger the p2m type
>>>>>>>>>>> flipping(i.e. by calling
>>>>>>>>>>> HVMOP_set_mem_type).
>>>>>>>>>> I agree - this would seem to be the simpler model here, despite
>>>>>>>>>> (as
>>>>>>>>>> George validly says) the more consistent model would be for the
>>>>>>>>>> hypervisor to do the cleanup. Such cleanup would imo be reasonable
>>>>>>>>>> only if there was an easy way for the hypervisor to enumerate all
>>>>>>>>>> p2m_ioreq_server pages.
>>>>>>>>> Well, for me, the "easy way" means we should avoid traversing
>>>>>>>>> the whole ept
>>>>>>>>> paging structure all at once, right?
>>>>>>>> Yes.
>>>>>>> Does calling p2m_change_entry_type_global() not satisfy this
>>>>>>> requirement?
>>>>>> Not really - that addresses the "low overhead" aspect, but not the
>>>>>> "enumerate all such entries" one.
>>>>> I'm sorry, I think I'm missing something here. What do we need the
>>>>> enumeration for?
>>>> We'd need that if we were to do the cleanup in the hypervisor (as
>>>> we can't rely on all p2m entry re-calculation to have happened by
>>>> the time a new ioreq server registers for the type).
>>> So you're afraid of this sequence of events?
>>> 1) Server A de-registered, triggering a ioreq_server -> ram_rw type
>>> change
>>> 2) gfn N is marked as misconfigured
>>> 3) Server B registers and marks gfn N as ioreq_server
>>> 4) When N is accessed, the misconfiguration is resolved incorrectly to
>>> ram_rw
>>>
>>> But that can't happen, because misconfigured entries are resolved before
>>> setting a p2m entry; so at step 3, gfn N will be first set to
>>> (non-misconfigured) ram_rw, then changed to (non-misconfigured)
>>> ioreq_server.
>>>
>>> Or is there another sequence of events that I'm missing?
>> Thanks for your reply, George. :)
>> If no log dirty is triggered during this process, your sequence is correct.
>> However, if log dirty is triggered, we'll met problems. I have described
>> this
>> in previous mails :
>>
>> http://lists.xenproject.org/archives/html/xen-devel/2016-06/msg02426.html
>> on Jun 20
>>
>> and
>>
>> http://lists.xenproject.org/archives/html/xen-devel/2016-06/msg02575.html
>> on Jun 21
> Right -- sorry, now I see the issue:
>
> 1. Server A marks gfn X as ioreq_server
> 2. Server A deregisters, gfn X misconfigured
> 3. Server B registers, marks gfn Y as ioreq_server
> 4. Logdirty mode enabled; gfn Y misconfigured
> 5. When X or Y are accessed, resolve_misconfigure() has no way of
> telling whether the entry is from server A (which should be set to
> logdirty) or from server B (which should be left as ioreq_server).
Exactly. :)
Another simpler scenario would be
1. Server A marks gfn X as p2m_ioreq_server;
2. Logdirty mode enabled; gfn X misconfigured;
3. When X is written, it will not cause ept vioalation, but ept
misconfig, and the
resolve_misconfig() would set gfn X back to p2m_ram_rw, thereafter we
can not
track access to X;
Note: Not resetting the p2m type for p2m_ioreq_server when
p2m->ioreq_server is
not NULL is suitable for this simpler scenario, but is not correct if
take your scenario
into account.
The core reason is I could not find a simple solution in
resolve_misconfig() to handle
handle both the outdated p2m_ioreq_server entries, the in-use ones and
to support
the logdirty feature at the same time.
> In a sense this is a deficiency in the change_entry_type_global()
> interface. A common OS principle is "make the common case fast, and the
> uncommon case correct". The scenario described above seems to me to be
> an uncommon case which is handled quickly but incorrectly; ideally we
> should handle it correctly, even if it's not very quick.
>
> Synchronously resolving a previous misconfig is probably the most
> straightforward thing to do. It could be done at point #3, when an M->N
> type change is not complete and a new p2m entry of type M is written; it
> could be at point #4, when an N->O type change is initiated while an
> M->N type change hasn't completed. Or it could be when an N->O type
> change happens while there are unfinished M->N transitions *and*
> post-type-change M entries.
Sorry, I did not quite get it. Could you please elaborate more? Thanks! :)
>
> But, that's a lot of somewhat complicated work for a scenario that is
> unlikely to happen in practice, and I can see why Yu Zhang would feel
> reluctant to do it.
Yes, that's part of my reasons. :)
>
> For the time being though, this will fail at #4, right? That is,
> logdirty mode cannot be enabled while server B is registered?
>
> That does mean we'd be forced to sort out the situation before we allow
> logdirty and ioreq_server to be used at the same time, but that doesn't
> really seem like such a bad idea to me.
One solution I thought of is to just return failure in
hap_enable_log_dirty()
if p2m->ioreq.server is not NULL. But I did not choose such approach,
because:
1> I still want to keep the logdirty feature so that XenGT can use it to
keep track
of the dirty rams when we support live migration in the future;
2> I also agree with Paul's argument: it is device model's duty to do
the p2m type
resetting work.
> I'm still open to being convinced, but at the moment it really seems to
> me like improving the situation is the better long-term option.
>
Thanks for all your advices, George. I'm also willing to taking other
advices, if we have
a more acceptable(for you, Jan and other maintainers) resync approach in
hypervisor,
I'd like to add this. If the code is too complicated, I can submit it
in a separate new
patchset. :)
B.R.
Yu
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
next prev parent reply other threads:[~2016-06-23 7:37 UTC|newest]
Thread overview: 68+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-05-19 9:05 [PATCH v4 0/3] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type Yu Zhang
2016-05-19 9:05 ` [PATCH v4 1/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server Yu Zhang
2016-06-14 10:04 ` Jan Beulich
2016-06-14 13:14 ` George Dunlap
2016-06-15 10:51 ` Yu Zhang
2016-05-19 9:05 ` [PATCH v4 2/3] x86/ioreq server: Add new functions to get/set memory types Yu Zhang
2016-05-19 9:05 ` [PATCH v4 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server Yu Zhang
2016-06-14 10:45 ` Jan Beulich
2016-06-14 13:13 ` George Dunlap
2016-06-14 13:31 ` Jan Beulich
2016-06-15 9:50 ` George Dunlap
2016-06-15 10:21 ` Jan Beulich
2016-06-15 11:28 ` George Dunlap
2016-06-16 9:30 ` Yu Zhang
2016-06-16 9:55 ` Jan Beulich
2016-06-17 10:17 ` George Dunlap
2016-06-20 9:03 ` Yu Zhang
2016-06-20 10:10 ` George Dunlap
2016-06-20 10:25 ` Jan Beulich
2016-06-20 10:32 ` George Dunlap
2016-06-20 10:55 ` Jan Beulich
2016-06-20 11:28 ` Yu Zhang
2016-06-20 13:13 ` George Dunlap
2016-06-21 7:42 ` Yu Zhang
2016-06-20 10:30 ` Yu Zhang
2016-06-20 10:43 ` George Dunlap
2016-06-20 10:45 ` Jan Beulich
2016-06-20 11:06 ` Yu Zhang
2016-06-20 11:20 ` Jan Beulich
2016-06-20 12:06 ` Yu Zhang
2016-06-20 13:38 ` Jan Beulich
2016-06-21 7:45 ` Yu Zhang
2016-06-21 8:22 ` Jan Beulich
2016-06-21 9:16 ` Yu Zhang
2016-06-21 9:47 ` Jan Beulich
2016-06-21 10:00 ` Yu Zhang
2016-06-21 14:38 ` George Dunlap
2016-06-22 6:39 ` Jan Beulich
2016-06-22 8:38 ` Yu Zhang
2016-06-22 9:11 ` Jan Beulich
2016-06-22 9:16 ` George Dunlap
2016-06-22 9:29 ` Jan Beulich
2016-06-22 9:47 ` George Dunlap
2016-06-22 10:07 ` Yu Zhang
2016-06-22 11:33 ` George Dunlap
2016-06-23 7:37 ` Yu Zhang [this message]
2016-06-23 10:33 ` George Dunlap
2016-06-24 4:16 ` Yu Zhang
2016-06-24 6:12 ` Jan Beulich
2016-06-24 7:12 ` Yu Zhang
2016-06-24 8:01 ` Jan Beulich
2016-06-24 9:57 ` Yu Zhang
2016-06-24 10:27 ` Jan Beulich
2016-06-22 10:10 ` Jan Beulich
2016-06-22 10:15 ` George Dunlap
2016-06-22 11:50 ` Jan Beulich
2016-06-15 10:52 ` Yu Zhang
2016-06-15 12:26 ` Jan Beulich
2016-06-16 9:32 ` Yu Zhang
2016-06-16 10:02 ` Jan Beulich
2016-06-16 11:18 ` Yu Zhang
2016-06-16 12:43 ` Jan Beulich
2016-06-20 9:05 ` Yu Zhang
2016-06-14 13:14 ` George Dunlap
2016-05-27 7:52 ` [PATCH v4 0/3] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type Zhang, Yu C
2016-05-27 10:00 ` Jan Beulich
2016-05-27 9:51 ` Zhang, Yu C
2016-05-27 10:02 ` George Dunlap
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=576B91B7.9070204@linux.intel.com \
--to=yu.c.zhang@linux.intel.com \
--cc=JBeulich@suse.com \
--cc=andrew.cooper3@citrix.com \
--cc=george.dunlap@citrix.com \
--cc=george.dunlap@eu.citrix.com \
--cc=jun.nakajima@intel.com \
--cc=kevin.tian@intel.com \
--cc=paul.durrant@citrix.com \
--cc=tim@xen.org \
--cc=xen-devel@lists.xen.org \
--cc=zhiyuan.lv@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.