Re: [PATCH V2 4/5] xen, libxc: Request page fault injection via libxc

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Razvan Cojocaru <rcojocaru@bitdefender.com>
To: "Tian, Kevin" <kevin.tian@intel.com>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>
Cc: "keir@xen.org" <keir@xen.org>,
	"ian.campbell@citrix.com" <ian.campbell@citrix.com>,
	"stefano.stabellini@eu.citrix.com"
	<stefano.stabellini@eu.citrix.com>,
	"Nakajima, Jun" <jun.nakajima@intel.com>,
	"Dong, Eddie" <eddie.dong@intel.com>,
	"ian.jackson@eu.citrix.com" <ian.jackson@eu.citrix.com>,
	"tim@xen.org" <tim@xen.org>,
	"jbeulich@suse.com" <jbeulich@suse.com>,
	"andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>
Subject: Re: [PATCH V2 4/5] xen, libxc: Request page fault injection via libxc
Date: Fri, 05 Sep 2014 12:19:40 +0300	[thread overview]
Message-ID: <5409802C.90103@bitdefender.com> (raw)
In-Reply-To: <AADFC41AFE54684AB9EE6CBC0274A5D12606376E@SHSMSX101.ccr.corp.intel.com>

On 09/04/2014 08:08 PM, Tian, Kevin wrote:
>> From: Razvan Cojocaru [mailto:rcojocaru@bitdefender.com]
>> Sent: Thursday, September 04, 2014 10:02 AM
>>
>> On 09/04/14 19:56, Razvan Cojocaru wrote:
>>> On 09/04/14 18:58, Tian, Kevin wrote:
>>>>> From: Razvan Cojocaru [mailto:rcojocaru@bitdefender.com]
>>>>> Sent: Thursday, September 04, 2014 1:02 AM
>>>>>
>>>>> On 09/04/2014 10:54 AM, Razvan Cojocaru wrote:
>>>>>> On 09/04/2014 04:25 AM, Tian, Kevin wrote:
>>>>>>>> From: Razvan Cojocaru [mailto:rcojocaru@bitdefender.com]
>>>>>>>> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
>>>>>>>> index 5761ff9..5d3e4d4 100644
>>>>>>>> --- a/xen/arch/x86/hvm/hvm.c
>>>>>>>> +++ b/xen/arch/x86/hvm/hvm.c
>>>>>>>> @@ -420,6 +420,31 @@ static bool_t hvm_wait_for_io(struct
>>>>>>>> hvm_ioreq_vcpu *sv, ioreq_t *p)
>>>>>>>>      return 1;
>>>>>>>>  }
>>>>>>>>
>>>>>>>> +static bool_t hvm_is_pf_requested(struct vcpu *v)
>>>>>>>> +{
>>>>>>>> +    const struct domain *d = v->domain;
>>>>>>>> +    struct segment_register seg;
>>>>>>>> +    uint64_t mask;
>>>>>>>> +
>>>>>>>> +    hvm_get_segment_register(v, x86_seg_ss, &seg);
>>>>>>>> +
>>>>>>>> +    if ( seg.attr.fields.dpl != 3 ) /* Guest is not in user mode */
>>>>>>>> +        return 0;
>>>>>>>> +
>>>>>>>> +    if ( hvm_long_mode_enabled(v) )
>>>>>>>> +        mask = PADDR_MASK & PAGE_MASK; /* Bits 51:12. */
>>>>>>>> +    else if ( hvm_pae_enabled(v) )
>>>>>>>> +        mask = 0x00000000ffffffe0; /* Bits 31:5. */
>>>>>>>> +    else
>>>>>>>> +        mask = (uint32_t)PAGE_MASK; /* Bits 31:12. */
>>>>>>>> +
>>>>>>>> +    if ( (v->arch.hvm_vcpu.guest_cr[3] & mask) !=
>>>>>>>> +         (d->arch.hvm_domain.inject_trap.cr3 & mask) )
>>>>>>>> +        return 0;
>>>>>>>> +
>>>>>>>> +    return 1;
>>>>>>>> +}
>>>>>>>> +
>>>>>>>>  void hvm_do_resume(struct vcpu *v)
>>>>>>>>  {
>>>>>>>>      struct domain *d = v->domain;
>>>>>>>> @@ -451,6 +476,15 @@ void hvm_do_resume(struct vcpu *v)
>>>>>>>>      }
>>>>>>>>
>>>>>>>>      /* Inject pending hw/sw trap */
>>>>>>>
>>>>>>> you want to make comment more introspection related.
>>>>>>>
>>>>>>>> +    if ( d->arch.hvm_domain.inject_trap.vector != -1 &&
>>>>>>>> +         v->arch.hvm_vcpu.inject_trap.vector == -1 &&
>>>>>>>> +         hvm_is_pf_requested(v) )
>>>>>>>> +    {
>>>>>>>> +        hvm_inject_trap(&d->arch.hvm_domain.inject_trap);
>>>>>>>> +        d->arch.hvm_domain.inject_trap.vector = -1;
>>>>>>>> +    }
>>>>>>>> +
>>>>>>>> +    /* Inject pending hw/sw trap */
>>>>>>>>      if ( v->arch.hvm_vcpu.inject_trap.vector != -1 )
>>>>>>>>      {
>>>>>>>>          hvm_inject_trap(&v->arch.hvm_vcpu.inject_trap);
>>>>>>>> @@ -1473,9 +1507,10 @@ int hvm_domain_initialise(struct domain
>> *d)
>>>>>>>>              printk(XENLOG_G_INFO "PVH guest must have HAP
>>>>> on\n");
>>>>>>>>              return -EINVAL;
>>>>>>>>          }
>>>>>>>> -
>>>>>>>>      }
>>>>>>>>
>>>>>>>> +    d->arch.hvm_domain.inject_trap.vector = -1;
>>>>>>>> +
>>>>>>>
>>>>>>> A question here. With new design, now you have two places which may
>>>>> cache
>>>>>>> fault injection information. If unfortunately two consecutive hypercalls
>> with
>>>>>>> one having vcpuid=-1 and the other having vcpuid=n, it's possible two
>>>>> injections
>>>>>>> will be handled together if same vcpu is chosen for two pending
>> injections.
>>>>>>>
>>>>>>> I think the hypercall needs a check of all previous pending requests, not
>>>>> only
>>>>>>> in domain specific structure as what you did in this version.
>>>>>
>>>>> If two consecutive hypercalls with a wildcard, and then a specific
>>>>> vcpuid take place, the per-domain injection data will indeed be set
>>>>> along with the VCPU-specific injection data.
>>>>
>>>> I'm thinking one hypercall with a wildchard, while another hypercall
>>>> with a specific vcpuid.
>>>
>>> If I understand this correctly, that's what my original design did, with
>>> xc_domain_request_pagefault(). Please see:
>>>
>>> http://lists.xen.org/archives/html/xen-devel/2014-08/msg02782.html
>>>
>>> But it has been argued that Xen already has HVMOP_inject_trap and that a
>>> new hypercall should only be added if it's not possible to extend the
>>> existing one.
>>
>> I might have misunderstood your comment - you were probably just
>> clarifying the sequence of hypercall calls, not suggesting implementing
>> a separate hypercall for the wildcard case.
>>
>>
> 
> Exactly. Just want to see a consistent policy. In current hypercall implementation,
> you check pending domain structure which implies only one pending injection
> allowed. But with vcpu structure it's possible to have two or even more pending
> in the same time. Need clarify your policy and make it consistent. :-)

I've modified the code to do this:

        rc = -ENOENT;

        if ( tr.vcpuid == (uint32_t)~0 ) /* Any VCPU. */
        {
            unsigned int i;

            for ( i = 0; i < d->max_vcpus; i++ )
                if ( (v = d->vcpu[i]) != NULL &&
                     v->arch.hvm_vcpu.inject_trap.vector == -1 )
                {
                    printk("Busy cpu\n");
                    rc = -EBUSY;
                    break;
                }

            if ( d->arch.hvm_domain.inject_trap.vector != -1 )
            {
                printk("Busy domain\n");
                rc = -EBUSY;
            }

            if ( rc != -EBUSY )
            {
                d->arch.hvm_domain.inject_trap.vector = tr.vector;
                d->arch.hvm_domain.inject_trap.type = tr.type;
                d->arch.hvm_domain.inject_trap.error_code = tr.error_code;
                d->arch.hvm_domain.inject_trap.insn_len = tr.insn_len;
                d->arch.hvm_domain.inject_trap.cr2 = tr.cr2;
                d->arch.hvm_domain.inject_trap.cr3 = tr.cr3;
                rc = 0;
            }
        }
        else /* Specific VCPU. */
        {
            if ( tr.vcpuid >= d->max_vcpus || (v = d->vcpu[tr.vcpuid])
== NULL )
                goto param_fail8;

            if ( v->arch.hvm_vcpu.inject_trap.vector != -1 ||
                 d->arch.hvm_domain.inject_trap.vector != -1 )
                rc = -EBUSY;
            else
            {
                v->arch.hvm_vcpu.inject_trap.vector = tr.vector;
                v->arch.hvm_vcpu.inject_trap.type = tr.type;
                v->arch.hvm_vcpu.inject_trap.error_code = tr.error_code;
                v->arch.hvm_vcpu.inject_trap.insn_len = tr.insn_len;
                v->arch.hvm_vcpu.inject_trap.cr2 = tr.cr2;
                v->arch.hvm_vcpu.inject_trap.cr3 = tr.cr3;
                rc = 0;
            }
        }

And unfortunately, in our scenario we get "Busy cpu" every time. So
rejecting the per-domain trap injection if at least one VCPU has a
pending trap makes the patch useless for the purpose it's been created
for. The original implementation that allows adding a pending per-domain
trap injection request did end up injecting the trap.

I'm not sure why there's always a pending trap injection on a VCPU at
the time of our call, but I suspect it's because of the mem_event
mechanism - we always do an HVMOP_inject_trap from a mem_event handler.
In any case, it always return -EBUSY with the new policy.

The original policy is to not allow more than one pending per-domain
trap injection request, or more than one pending per-VCPU trap injection
request - but to allow a per-domain pending injection request on top of
any VCPU-specific request. This is what the tests in the original code
ensure happens, so I'm not sure where the policy is inconsistent. The
way I've read the code, in that event both traps are injected - first
the VCPU one, and then, if conditions are met, the per-domain one later
also. Is that incorrect?


Thanks,
Razvan Cojocaru

next prev parent reply	other threads:[~2014-09-05  9:19 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-09-03 11:20 [PATCH V2 1/5] xen: Emulate with no writes Razvan Cojocaru
2014-09-03 11:20 ` [PATCH V2 2/5] xen: Optimize introspection access to guest state Razvan Cojocaru
2014-09-03 11:20 ` [PATCH V2 3/5] xen, libxc: Force-enable relevant MSR events Razvan Cojocaru
2014-09-04  1:04   ` Tian, Kevin
2014-09-03 11:20 ` [PATCH V2 4/5] xen, libxc: Request page fault injection via libxc Razvan Cojocaru
2014-09-03 13:34   ` Ian Campbell
2014-09-04  1:25   ` Tian, Kevin
     [not found]     ` <54081ABA.8040007@bitdefender.com>
     [not found]       ` <54081C5D.2050206@bitdefender.com>
2014-09-04 15:58         ` Tian, Kevin
2014-09-04 16:56           ` Razvan Cojocaru
2014-09-04 17:02             ` Razvan Cojocaru
2014-09-04 17:08               ` Tian, Kevin
2014-09-05  9:19                 ` Razvan Cojocaru [this message]
2014-09-05  9:34                   ` Razvan Cojocaru
2014-09-04 17:06             ` Tian, Kevin
2014-09-03 11:20 ` [PATCH V2 5/5] xen: Handle resumed instruction based on previous mem_event reply Razvan Cojocaru

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5409802C.90103@bitdefender.com \
    --to=rcojocaru@bitdefender.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=eddie.dong@intel.com \
    --cc=ian.campbell@citrix.com \
    --cc=ian.jackson@eu.citrix.com \
    --cc=jbeulich@suse.com \
    --cc=jun.nakajima@intel.com \
    --cc=keir@xen.org \
    --cc=kevin.tian@intel.com \
    --cc=stefano.stabellini@eu.citrix.com \
    --cc=tim@xen.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.