From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Keir Fraser <keir@xen.org>, Xen-devel <xen-devel@lists.xen.org>
Subject: Re: [PATCH 1/4] x86/domctl: Implement XEN_DOMCTL_{get,set}_vcpu_msrs
Date: Thu, 5 Jun 2014 14:01:45 +0100
Message-ID: <53906A39.3070506@citrix.com>
In-Reply-To: <539082E20200007800018344@mail.emea.novell.com>

On 05/06/14 13:46, Jan Beulich wrote:
>>>> On 04.06.14 at 19:26, <andrew.cooper3@citrix.com> wrote:
>> Despite my 'Reviewed-by' tag on c/s 65e3554908 "x86/PV: support data
>> breakpoint extension registers", I have re-evaluated my position as far as 
>> the hypercall interface is concerned.
>>
>> Previously, for the sake of not modifying the migration code in libxc,
>> XEN_DOMCTL_get_ext_vcpucontext would jump through hoops to return -ENOBUFS if
>> and only if MSRs were in use and no buffer was present.
>>
>> This is fragile, and awkward from a toolstack point-of-view when actually
>> sending MSR content in the migration stream.  It also complicates fixing a
>> further race condition between querying the number of MSRs for a vcpu and
>> the vcpu touching a new one.
>>
>> As this code is still only in staging, take this opportunity to redesign the
> s/staging/unstable/

Oops - yes.

>
>> --- a/xen/arch/x86/domctl.c
>> +++ b/xen/arch/x86/domctl.c
>> @@ -1341,6 +1341,132 @@ long arch_do_domctl(
>>      }
>>      break;
>>  
>> +    case XEN_DOMCTL_get_vcpu_msrs:
>> +    case XEN_DOMCTL_set_vcpu_msrs:
>> +    {
>> +        struct xen_domctl_vcpu_msrs *vmsrs = &domctl->u.vcpu_msrs;
>> +        struct xen_domctl_vcpu_msr msr;
>> +        struct vcpu *v;
>> +        uint32_t nr_msrs = 0;
>> +
>> +        ret = -ESRCH;
>> +        if ( (vmsrs->vcpu >= d->max_vcpus) ||
>> +             ((v = d->vcpu[vmsrs->vcpu]) == NULL) )
>> +            break;
>> +
>> +        ret = -EINVAL;
>> +        if ( (v == current) || /* no vcpu_pause() */
>> +             !is_pv_domain(d) )
>> +            break;
>> +
>> +        /* Count maximum number of optional msrs. */
>> +        if ( boot_cpu_has(X86_FEATURE_DBEXT) )
>> +            nr_msrs += 4;
>> +
>> +        if ( domctl->cmd == XEN_DOMCTL_get_vcpu_msrs )
>> +        {
>> +            /* NULL guest handle is a request for max size. */
>> +            if ( guest_handle_is_null(vmsrs->msrs) ||
>> +                 (vmsrs->msr_count < nr_msrs) )
>> +            {
>> +                vmsrs->msr_count = nr_msrs;
>> +                ret = guest_handle_is_null(vmsrs->msrs) ? 0 : -ENOBUFS;
> I don't think you should be failing "get" if there is enough space in
> the provided buffer to store the actually used number of MSRs. That
> way the caller may make a first call with a few (rather than none at
> all) entries, and grow the buffer only if this wasn't sufficient.

I am not sure I agree.  The order of MSRs in the buffer is not under
the caller's control, and reissuing the hypercall with a larger buffer
rewrites it from the start.

The sole user of this hypercall needs a guarantee that every MSR has
been retrieved; anything less results in VM corruption.  Permitting a
partial get would also make the return value ambiguous for a caller
which issues the hypercall once with a guessed buffer size, although I
suspect that is the lesser concern.
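
For illustration, the only usage pattern I intend to support is the
two-call sequence below (a sketch only; xc_vcpu_get_msrs() is a
hypothetical libxc-style wrapper, and error handling is elided):

    xen_domctl_vcpu_msr_t *msrs;
    uint32_t count = 0;

    /* 1) NULL buffer: the hypervisor replies with the maximum count. */
    if ( xc_vcpu_get_msrs(xch, domid, vcpu, NULL, &count) )
        return -1;

    msrs = calloc(count, sizeof(*msrs));

    /* 2) Fetch everything in a single call.  Any failure, -ENOBUFS
     *    included, must be treated as fatal, as acting on a partial
     *    get would corrupt the migration stream. */
    if ( xc_vcpu_get_msrs(xch, domid, vcpu, msrs, &count) )
        return -1;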

>
>> +                }
>> +
>> +                vcpu_unpause(v);
>> +
>> +                /* Check we didn't lie to userspace then overflow the buffer */
>> +                BUG_ON(i > nr_msrs);
>> +                vmsrs->msr_count = i;
>> +            }
>> +
>> +            copyback = 1;
>> +        }
>> +        else
>> +        {
>> +            ret = -EINVAL;
>> +            if ( vmsrs->msr_count > nr_msrs )
>> +                break;
> Similarly, I don't think you should fail the operation simply because
> the count is too large. Who knows when or why it may make sense for
> the caller to specify multiple entries with the same index.
>
> But then again, the way you do it we have a statically determined
> maximum, and don't need to worry about preemption here until a
> really large number of MSRs would get handled.

For now, this prevents looping all the way to the end of a
guest-controlled uint32_t, as per some of the concerns in XSA-77.

If a scenario ever arises where reloading the same MSR several times
is valid, the logic can be revisited.
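
(If the static bound ever needed to go away, the usual shape would be
a preemption check inside the loop.  Sketch only; the continuation
plumbing is deliberately omitted:)

    for ( i = 0; i < vmsrs->msr_count; ++i )
    {
        if ( i && hypercall_preempt_check() )
        {
            /* Record progress and set up a hypercall continuation
             * here; omitted, as the static bound makes this
             * unnecessary today. */
            break;
        }

        /* ... per-MSR validation and loading, as in the patch ... */
    }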

>
>> +
>> +            vcpu_pause(v);
>> +
>> +            for ( i = 0; i < vmsrs->msr_count; ++i )
>> +            {
>> +                ret = -EFAULT;
>> +                if ( copy_from_guest_offset(&msr, vmsrs->msrs, i, 1) )
>> +                    break;
>> +
>> +                ret = -EINVAL;
>> +                if ( msr.reserved )
>> +                    break;
>> +
>> +                switch ( msr.index )
>> +                {
>> +                case MSR_AMD64_DR0_ADDRESS_MASK:
>> +                    if ( !boot_cpu_has(X86_FEATURE_DBEXT) ||
>> +                         (msr.value >> 32) )
>> +                        break;
>> +                    v->arch.pv_vcpu.dr_mask[0] = msr.value;
>> +                    continue;
>> +
>> +                case MSR_AMD64_DR1_ADDRESS_MASK ...
>> +                    MSR_AMD64_DR3_ADDRESS_MASK:
>> +                    if ( !boot_cpu_has(X86_FEATURE_DBEXT) ||
>> +                         (msr.value >> 32) )
>> +                        break;
>> +                    msr.index -= MSR_AMD64_DR1_ADDRESS_MASK - 1;
>> +                    v->arch.pv_vcpu.dr_mask[msr.index] = msr.value;
>> +                    continue;
>> +                }
>> +                break;
>> +            }
>> +
>> +            vcpu_unpause(v);
>> +
>> +            if ( i == vmsrs->msr_count )
>> +                ret = 0;
> else {
>     vmsrs->msr_count = i;
>     copyback = 1;
> }
>
> to at once give the caller an indication which slot failed?
>
> Jan

Ok - seems useful as a debugging measure.
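
I.e. the tail of the set path becomes (sketch):

    vcpu_unpause(v);

    if ( i == vmsrs->msr_count )
        ret = 0;
    else
    {
        /* Tell the caller which slot failed. */
        vmsrs->msr_count = i;
        copyback = 1;
    }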

~Andrew
