From: Yu Zhang <yu.c.zhang@linux.intel.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Kevin Tian <kevin.tian@intel.com>,
	George Dunlap <george.dunlap@eu.citrix.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Tim Deegan <tim@xen.org>,
	xen-devel@lists.xen.org, Paul Durrant <paul.durrant@citrix.com>,
	zhiyuan.lv@intel.com, Jun Nakajima <jun.nakajima@intel.com>
Subject: Re: [PATCH v9 2/5] x86/ioreq server: Add DMOP to map guest ram with p2m_ioreq_server to an ioreq server.
Date: Fri, 24 Mar 2017 20:35:49 +0800	[thread overview]
Message-ID: <58D512A5.5070008@linux.intel.com> (raw)
In-Reply-To: <58D500D402000078001472C2@prv-mh.provo.novell.com>



On 3/24/2017 6:19 PM, Jan Beulich wrote:
>>>> On 24.03.17 at 10:05, <yu.c.zhang@linux.intel.com> wrote:
>> On 3/23/2017 4:57 PM, Jan Beulich wrote:
>>>>>> On 23.03.17 at 04:23, <yu.c.zhang@linux.intel.com> wrote:
>>>> On 3/22/2017 10:21 PM, Jan Beulich wrote:
>>>>>>>> On 21.03.17 at 03:52, <yu.c.zhang@linux.intel.com> wrote:
>>>>>> @@ -177,8 +178,64 @@ static int hvmemul_do_io(
>>>>>>             break;
>>>>>>         case X86EMUL_UNHANDLEABLE:
>>>>>>         {
>>>>>> -        struct hvm_ioreq_server *s =
>>>>>> -            hvm_select_ioreq_server(curr->domain, &p);
>>>>>> +        /*
>>>>>> +         * Xen isn't emulating the instruction internally, so see if
>>>>>> +         * there's an ioreq server that can handle it. Rules:
>>>>>> +         *
>>>>>> +         * - PIO and "normal" MMIO run through hvm_select_ioreq_server()
>>>>>> +         * to choose the ioreq server by range. If no server is found,
>>>>>> +         * the access is ignored.
>>>>>> +         *
>>>>>> +         * - p2m_ioreq_server accesses are handled by the designated
>>>>>> +         * ioreq_server for the domain, but there are some corner
>>>>>> +         * cases:
>>>>>> +         *
>>>>>> +         *   - If the domain ioreq_server is NULL, assume there is a
>>>>>> +         *   race between the unbinding of ioreq server and guest fault
>>>>>> +         *   so re-try the instruction.
>>>>> And that retry won't come back here because of? (The answer
>>>>> should not include any behavior added by subsequent patches.)
>>>> You got me. :)
>>>> In this patch, retry will come back here. It should be after patch 4 or
>>>> patch 5 that the retry
>>>> will be ignored(p2m type changed back to p2m_ram_rw after the unbinding).
>>> In which case I think we shouldn't insist on you to change things, but
>>> you should spell out very clearly that this patch should not go in
>>> without the others going in at the same time.
>> So maybe it would be better we leave the retry part to a later patch,
>> say patch 4/5 or patch 5/5,
>> and return unhandleable in this patch?
> I don't follow. I've specifically suggested that you don't change
> the code, but simply state clearly the requirement that patches
> 2...5 of this series should all go in at the same time. I don't mind
> you making changes, but the risk then is that further round trips
> may be required because of there being new issues with the
> changes you may do.

Thanks, Jan. I'll keep the code, and add a note in the commit message of 
this patch.
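The selection rules spelled out in the quoted comment can be sketched
as follows. This is a minimal, self-contained illustration with
hypothetical simplified types (select_server(), select_by_range(), the
enums), not the actual Xen code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real Xen types. */
typedef enum { P2M_RAM_RW, P2M_MMIO_DM, P2M_IOREQ_SERVER } p2m_type_t;
typedef enum { X86EMUL_OKAY, X86EMUL_RETRY } emul_rc_t;

struct ioreq_server { int id; };

/* Stand-in for hvm_select_ioreq_server(): pick a server by PIO/MMIO
 * range, or NULL if no server claims the access. */
static struct ioreq_server *select_by_range(int claimed)
{
    static struct ioreq_server range_server = { 1 };
    return claimed ? &range_server : NULL;
}

/* The rules from the comment: p2m_ioreq_server accesses go to the
 * domain's designated server; a NULL designated server is taken as a
 * race with unbinding, so the instruction is retried. */
static emul_rc_t select_server(p2m_type_t t, struct ioreq_server *designated,
                               int claimed, struct ioreq_server **out)
{
    *out = NULL;
    if ( t == P2M_IOREQ_SERVER )
    {
        if ( designated == NULL )
            return X86EMUL_RETRY;   /* race with unbinding: retry */
        *out = designated;
        return X86EMUL_OKAY;
    }
    /* PIO and "normal" MMIO: choose by range; if no server is found
     * (*out stays NULL) the access is ignored. */
    *out = select_by_range(claimed);
    return X86EMUL_OKAY;
}
```

As discussed above, in this patch alone the retry loops back here;
only with patches 4/5 applied does the type revert to p2m_ram_rw after
an unbinding, which is what ends the retry.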

>>>>>> --- a/xen/arch/x86/mm/hap/nested_hap.c
>>>>>> +++ b/xen/arch/x86/mm/hap/nested_hap.c
>>>>>> @@ -172,7 +172,7 @@ nestedhap_walk_L0_p2m(struct p2m_domain *p2m, paddr_t L1_gpa, paddr_t *L0_gpa,
>>>>>>         if ( *p2mt == p2m_mmio_direct )
>>>>>>             goto direct_mmio_out;
>>>>>>         rc = NESTEDHVM_PAGEFAULT_MMIO;
>>>>>> -    if ( *p2mt == p2m_mmio_dm )
>>>>>> +    if ( *p2mt == p2m_mmio_dm || *p2mt == p2m_ioreq_server )
>>>>> Btw., how does this addition match up with the rc value being
>>>>> assigned right before the if()?
>>>> Well returning a NESTEDHVM_PAGEFAULT_MMIO in such case will trigger
>>>> handle_mmio() later in
>>>> hvm_hap_nested_page_fault(). Guess that is what we expected.
>>> That's probably what is expected, but it's no MMIO which we're
>>> doing in that case. And note that we've stopped abusing
>>> handle_mmio() for non-MMIO purposes a little while ago (commit
>>> 3dd00f7b56 ["x86/HVM: restrict permitted instructions during
>>> special purpose emulation"]).
>> OK. So what about we just remove this "*p2mt == p2m_ioreq_server"?
> Well, you must have had a reason to add it. To be honest, I don't
> care too much about the nested code (as it's far from production
> ready anyway), so leaving the code above untouched would be
> fine with me, but taking care of adjustments to nested code where
> they're actually needed would be even better. So the preferred
> option is for you to explain why you've done the change above,
> and why you think it's correct/needed. The next best option might
> be to drop the change.

Got it. I now prefer to drop the change. This code was added at an
early stage of this patchset, when we hoped p2m_ioreq_server would
always trigger handle_mmio(); but frankly we do not, and probably will
not, use the nested case in the foreseeable future.

>>>>>> --- a/xen/arch/x86/mm/p2m-ept.c
>>>>>> +++ b/xen/arch/x86/mm/p2m-ept.c
>>>>>> @@ -131,6 +131,13 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, ept_entry_t *entry,
>>>>>>                 entry->r = entry->w = entry->x = 1;
>>>>>>                 entry->a = entry->d = !!cpu_has_vmx_ept_ad;
>>>>>>                 break;
>>>>>> +        case p2m_ioreq_server:
>>>>>> +            entry->r = 1;
>>>>>> +            entry->w = !(p2m->ioreq.flags & XEN_DMOP_IOREQ_MEM_ACCESS_WRITE);
>>>>> Is this effectively open coded p2m_get_ioreq_server() actually
>>>>> okay? If so, why does the function need to be used elsewhere,
>>>>> instead of doing direct, lock-free accesses?
>>>> Maybe your comments is about whether it is necessary to use the lock in
>>>> p2m_get_ioreq_server()?
>>>> I still believe so, it does not only protect the value of ioreq server,
>>>> but also the flag together with it.
>>>>
>>>> Besides, it is used not only in the emulation process, but also the
>>>> hypercall to set the mem type.
>>>> So the lock can still provide some kind protection against the
>>>> p2m_set_ioreq_server() - even it does
>>>> not always do so.
>>> The question, fundamentally, is about consistency: The same
>>> access model should be followed universally, unless there is an
>>> explicit reason for an exception.
>> Sorry, I do not quite understand. Why the consistency is broken?
> Because you don't call p2m_get_ioreq_server() here (discarding
> the return value, but using the flags).

Oh, I see. You are worried about p2m->ioreq.server/flags being cleared
by an unmap. Below are the situations I can think of...

>> I think this lock at least protects the ioreq server and the flag. The
>> only exception
>> is the one you mentioned - s could become stale which we agreed to let
>> the device
>> model do the check. Without this lock, things would become more complex
>> - more
>> race conditions...
> Sure, all understood. I wasn't really suggesting to drop the locked
> accesses, but instead I was using this to illustrate the non-locked
> access (and hence the inconsistency with other code) here. As
> said - if there's a good reason not to call the function here, I'm all
> ears.

... ept_p2m_type_to_flags() is used in 2 cases:
1> in resolve_misconfig(), to do the recalculation for a p2m entry -
in that case we won't see a p2m_ioreq_server type in
ept_p2m_type_to_flags(), because it has already been recalculated back
to p2m_ram_rw in the caller;

2> triggered by p2m_set_entry(), which is trying to set the mem type
for some gfns. The only scenario I can imagine (admittedly an extreme
one) with racing potential is: while the mem type is being set, the
ioreq server unmapping is triggered on another cpu, which then
invalidates the value of p2m->ioreq.flags; in that case
ept_p2m_type_to_flags() will return an entry with writable permission.
But right after the mem type setting is done, the p2m lock will be
released and the unmapping hypercall will get the opportunity to reset
this p2m entry. We do not need a write-protected entry in such a case
anyway.
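The lock-free flags read at the heart of this race can be boiled down
to the following sketch. The types here are hypothetical minimal
stand-ins (struct p2m with a bare ioreq_flags field), not the real Xen
structures; only the flag test mirrors the patch's
`entry->w = !(p2m->ioreq.flags & XEN_DMOP_IOREQ_MEM_ACCESS_WRITE)`:

```c
#include <assert.h>

#define XEN_DMOP_IOREQ_MEM_ACCESS_WRITE 1u

/* Hypothetical minimal stand-in for the relevant p2m state. */
struct p2m { unsigned int ioreq_flags; };

/* Lock-free read, as in the open-coded ept_p2m_type_to_flags() hunk:
 * it may observe flags that an unmap on another CPU is about to clear
 * (or has just cleared), yielding a writable entry. */
static int entry_writable(const struct p2m *p2m)
{
    return !(p2m->ioreq_flags & XEN_DMOP_IOREQ_MEM_ACCESS_WRITE);
}

/* What the unmap path does to the flags (under the lock, in reality). */
static void unmap_ioreq_server(struct p2m *p2m)
{
    p2m->ioreq_flags = 0;
}
```

If the unmap wins the race, the entry comes out writable - which, as
argued above, is harmless, since the unmap hypercall resets the entry
once the p2m lock is released.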

Besides, even if p2m_get_ioreq_server() were used here in
ept_p2m_type_to_flags(), it could only provide limited protection;
there would still be a chance that the flags it returns to
ept_p2m_type_to_flags() are outdated in the above situation.

So, since we do not really care about the p2m entry permission in such
an extreme situation, and we cannot 100% guarantee the lock will
protect it, I do not think we need to use the lock here.
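For contrast, the locked-accessor pattern that p2m_get_ioreq_server()
follows elsewhere looks roughly like this. This is a hypothetical,
simplified sketch (a pthread mutex and struct p2m_ioreq standing in
for the real Xen lock and state), showing what the lock does buy - a
mutually consistent (server, flags) pair - and what it does not: the
snapshot can still go stale the moment the lock is dropped.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical minimal stand-ins for the real Xen types. */
struct ioreq_server { int id; };

struct p2m_ioreq {
    pthread_mutex_t lock;       /* stands in for the real ioreq lock */
    struct ioreq_server *server;
    unsigned int flags;
};

static struct ioreq_server *get_ioreq_server(struct p2m_ioreq *p,
                                             unsigned int *flags)
{
    struct ioreq_server *s;

    pthread_mutex_lock(&p->lock);
    s = p->server;              /* consistent (server, flags) snapshot... */
    *flags = p->flags;
    pthread_mutex_unlock(&p->lock);

    return s;                   /* ...which may be stale once the lock
                                 * is dropped - the limitation noted
                                 * above. */
}
```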

I am not sure whether this explanation is convincing to you, but I'm
also open to being convinced. :-)

Thanks
Yu


> Jan
>


