From: Yu Zhang <yu.c.zhang@linux.intel.com>
To: George Dunlap <george.dunlap@citrix.com>
Cc: Paul Durrant <Paul.Durrant@citrix.com>,
"Lv, Zhiyuan" <zhiyuan.lv@intel.com>,
Jan Beulich <JBeulich@suse.com>,
"Xen-devel@lists.xen.org" <Xen-devel@lists.xen.org>
Subject: Re: [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
Date: Mon, 26 Sep 2016 14:58:07 +0800 [thread overview]
Message-ID: <57E8C6FF.3050708@linux.intel.com> (raw)
In-Reply-To: <57E3A06E.1020000@linux.intel.com>
On 9/22/2016 5:12 PM, Yu Zhang wrote:
>
>
> On 9/21/2016 9:04 PM, George Dunlap wrote:
>> On Fri, Sep 9, 2016 at 6:51 AM, Yu Zhang <yu.c.zhang@linux.intel.com>
>> wrote:
>>>> On 9/2/2016 6:47 PM, Yu Zhang wrote:
>>>>> A new HVMOP - HVMOP_map_mem_type_to_ioreq_server, is added to
>>>>> let one ioreq server claim/disclaim its responsibility for the
>>>>> handling of guest pages with p2m type p2m_ioreq_server. Users
>>>>> of this HVMOP can specify which kind of operation is supposed
>>>>> to be emulated in a parameter named flags. Currently, this HVMOP
>>>>> only support the emulation of write operations. And it can be
>>>>> further extended to support the emulation of read ones if an
>>>>> ioreq server has such requirement in the future.
>>>>>
>>>>> For now, we only support one ioreq server for this p2m type, so
>>>>> once an ioreq server has claimed its ownership, subsequent calls
>>>>> of the HVMOP_map_mem_type_to_ioreq_server will fail. Users can also
>>>>> disclaim the ownership of guest ram pages with p2m_ioreq_server, by
>>>>> triggering this new HVMOP, with ioreq server id set to the current
>>>>> owner's and flags parameter set to 0.
>>>>>
>>>>> Note both HVMOP_map_mem_type_to_ioreq_server and p2m_ioreq_server
>>>>> are only supported for HVMs with HAP enabled.
>>>>>
>>>>> Also note that only after one ioreq server claims its ownership
>>>>> of p2m_ioreq_server, will the p2m type change to p2m_ioreq_server
>>>>> be allowed.
>>>>>
>>>>> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
>>>>> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
>>>>> Acked-by: Tim Deegan <tim@xen.org>
>>>>> ---
>>>>> Cc: Paul Durrant <paul.durrant@citrix.com>
>>>>> Cc: Jan Beulich <jbeulich@suse.com>
>>>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>>>> Cc: George Dunlap <george.dunlap@eu.citrix.com>
>>>>> Cc: Jun Nakajima <jun.nakajima@intel.com>
>>>>> Cc: Kevin Tian <kevin.tian@intel.com>
>>>>> Cc: Tim Deegan <tim@xen.org>
>>>>>
>>>>> changes in v6:
>>>>> - Clarify logic in hvmemul_do_io().
>>>>> - Use recursive lock for ioreq server lock.
>>>>> - Remove debug print when mapping ioreq server.
>>>>> - Clarify code in ept_p2m_type_to_flags() for consistency.
>>>>> - Remove definition of P2M_IOREQ_HANDLE_WRITE_ACCESS.
>>>>> - Add comments for HVMMEM_ioreq_server to note only changes
>>>>> to/from HVMMEM_ram_rw are permitted.
>>>>> - Add domain_pause/unpause() in
>>>>> hvm_map_mem_type_to_ioreq_server()
>>>>> to avoid the race condition when a vm exit happens on a write-
>>>>> protected page, just to find the ioreq server has been unmapped
>>>>> already.
>>>>> - Introduce a seperate patch to delay the release of p2m
>>>>> lock to avoid the race condition.
>>>>> - Introduce a seperate patch to handle the read-modify-write
>>>>> operations on a write protected page.
>>>>>
>>>> Why do we need to do this? Won't the default case just DTRT if it
>>>> finds
>>>> that the ioreq server has been unmapped?
>>>
>>> Well, patch 4 will either mark the remaining p2m_ioreq_server
>>> entries as
>>> "recal" or
>>> reset to p2m_ram_rw directly. So my understanding is that we do not
>>> wish to
>>> see a ept violation due to a p2m_ioreq_server access after the ioreq
>>> server
>>> is unmapped.
>>> Yet without this domain_pause/unpause() pair, VM accesses may
>>> trigger an ept
>>> violation
>>> during the hvmop hypercall(hvm_map_mem_type_to_ioreq_server), just
>>> to find
>>> the ioreq
>>> server is NULL. Then we would have to provide handlers which just do
>>> the
>>> copy to/from
>>> actions for the VM. This seems awkward to me.
>> So the race you're worried about is this:
>>
>> 1. Guest fault happens
>> 2. ioreq server calls map_mem_type_to_ioreq_server, unhooking
>> 3. guest finds no ioreq server present
>>
>> I think in that case the easiest thing to do would be to simply assume
>> there was a race and re-execute the instruction. Is that not possible
>> for some reason?
>>
>> -George
>
> Thanks for your reply, George. :)
> Two reasons I'd like to use the domain_pause/unpause() to avoid the
> race condition:
>
> 1> Like my previous explanation, in the read-modify-write scenario,
> the ioreq server will
> be NULL for the read emulation. But in such case, hypervisor will not
> discard this trap, instead
> it is supposed to do the copy work for the read access. So it would be
> difficult for hypervisor
> to decide if the ioreq server was detached due to a race condition, or
> if the ioreq server should
> be a NULL because we are emulating a read operation first for a
> read-modify-write instruction.
>
> 2> I also realized it can avoid a deadlock possibility -
> a> dom0 triggers map_mem_type_to_ioreq_server, which spin locks
> the ioreq_server.lock,
> and before routine hvm_map_mem_type_to_ioreq_server() returns,
> b> The HVM triggers an ept violation, and p2m lock is held by
> hvm_hap_nested_page_fault();
> c> hypervisor continues to the I/O handler, which will need the
> ioreq_server.lock to select
> an ioreq server, which is being held by the hypercall handler
> side;
> d> likewise, the map_mem_type_to_ioreq_server will meet problem
> when trying to get the
> p2m lock when it calls p2m_change_entry_type_global().
> With a domain_pause/unpause(), we could avoid the ept violation
> during this period(which I
> believe won't be long because this pair only covers the
> p2m_change_entry_type_global() call, not
> the p2m_finish_type_change() call).
Sorry, the deadlock issue should be between the p2m->ioreq.lock and the
p2m lock which are both
inside p2m_set_ioreq_server() called by hvm_map_mem_type_to_ioreq_server().
The sequence when deadlock happens is similar to the above description:
a> dom0 triggers map_mem_type_to_ioreq_server, to unmap the
p2m_ioreq_server type;
b> The HVM triggers an ept violation, and p2m lock is held by
hvm_hap_nested_page_fault();
c> hypervisor continues to the I/O handler, which will need the
p2m->ioreq.lock to select
an ioreq server for the write protected address, which is being
held by the hypercall handler
side;
d> likewise, p2m_set_ioreq_server() will meet problem when trying to get
the p2m lock when
it calls p2m_change_entry_type_global().
Now I believe we can avoid this deadlock by moving the
p2m_change_entry_type_global() outside
the p2m_set_ioreq_server() to its caller
hvm_map_mem_type_to_ioreq_server(), and there would
be no nested lock between the p2m->ioreq.lock and p2m lock. :)
A code snippet:
@@ -944,6 +944,13 @@ int hvm_map_mem_type_to_ioreq_server(struct domain
*d, ioservid_t id,
if ( s->id == id )
{
rc = p2m_set_ioreq_server(d, flags, s);
+ if ( rc == 0 && flags == 0 )
+ {
+ struct p2m_domain *p2m = p2m_get_hostp2m(d);
+
+ if ( read_atomic(&p2m->ioreq.entry_count) )
+ p2m_change_entry_type_global(d, p2m_ioreq_server,
p2m_ram_rw);
+ }
break;
}
}
Thanks
Yu
>
> Thanks
> Yu
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
next prev parent reply other threads:[~2016-09-26 6:58 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-09-02 10:47 [PATCH v6 0/4] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type Yu Zhang
2016-09-02 10:47 ` [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server Yu Zhang
2016-09-05 13:31 ` Jan Beulich
2016-09-05 17:20 ` George Dunlap
2016-09-06 7:58 ` Jan Beulich
2016-09-06 8:03 ` Paul Durrant
2016-09-06 8:13 ` Jan Beulich
2016-09-06 10:00 ` Yu Zhang
2016-09-09 5:55 ` Yu Zhang
2016-09-09 8:09 ` Jan Beulich
2016-09-09 8:59 ` Yu Zhang
2016-09-05 17:23 ` George Dunlap
[not found] ` <57D24730.2050904@linux.intel.com>
2016-09-09 5:51 ` Yu Zhang
2016-09-21 13:04 ` George Dunlap
2016-09-22 9:12 ` Yu Zhang
2016-09-22 11:32 ` George Dunlap
2016-09-22 16:02 ` Yu Zhang
2016-09-23 10:35 ` George Dunlap
2016-09-26 6:57 ` Yu Zhang
2016-09-26 6:58 ` Yu Zhang [this message]
2016-09-02 10:47 ` [PATCH v6 2/4] x86/ioreq server: Release the p2m lock after mmio is handled Yu Zhang
2016-09-05 13:49 ` Jan Beulich
[not found] ` <57D24782.6010701@linux.intel.com>
2016-09-09 5:56 ` Yu Zhang
2016-09-02 10:47 ` [PATCH v6 3/4] x86/ioreq server: Handle read-modify-write cases for p2m_ioreq_server pages Yu Zhang
2016-09-05 14:10 ` Jan Beulich
[not found] ` <57D247F6.9010503@linux.intel.com>
2016-09-09 6:21 ` Yu Zhang
2016-09-09 8:12 ` Jan Beulich
2016-09-02 10:47 ` [PATCH v6 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps Yu Zhang
2016-09-05 14:47 ` Jan Beulich
[not found] ` <57D24813.2090903@linux.intel.com>
2016-09-09 7:24 ` Yu Zhang
2016-09-09 8:20 ` Jan Beulich
2016-09-09 9:24 ` Yu Zhang
2016-09-09 9:44 ` Jan Beulich
2016-09-09 9:56 ` Yu Zhang
2016-09-09 10:09 ` Jan Beulich
2016-09-09 10:01 ` Yu Zhang
2016-09-20 2:57 ` Yu Zhang
2016-09-22 18:06 ` George Dunlap
2016-09-23 1:31 ` Yu Zhang
2016-09-06 10:57 ` [PATCH v6 0/4] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type Yu Zhang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=57E8C6FF.3050708@linux.intel.com \
--to=yu.c.zhang@linux.intel.com \
--cc=JBeulich@suse.com \
--cc=Paul.Durrant@citrix.com \
--cc=Xen-devel@lists.xen.org \
--cc=george.dunlap@citrix.com \
--cc=zhiyuan.lv@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).