Re: [PATCH 1/4] x86/domctl: Implement XEN_DOMCTL_{get, set}_vcpu_msrs

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Keir Fraser <keir@xen.org>, Xen-devel <xen-devel@lists.xen.org>
Subject: Re: [PATCH 1/4] x86/domctl: Implement XEN_DOMCTL_{get, set}_vcpu_msrs
Date: Thu, 5 Jun 2014 14:01:45 +0100	[thread overview]
Message-ID: <53906A39.3070506@citrix.com> (raw)
In-Reply-To: <539082E20200007800018344@mail.emea.novell.com>

On 05/06/14 13:46, Jan Beulich wrote:
>>>> On 04.06.14 at 19:26, <andrew.cooper3@citrix.com> wrote:
>> Despite my 'Reviewed-by' tag on c/s 65e3554908 "x86/PV: support data
>> breakpoint extension registers", I have re-evaluated my position as far as 
>> the hypercall interface is concerned.
>>
>> Previously, for the sake of not modifying the migration code in libxc,
>> XEN_DOMCTL_get_ext_vcpucontext would jump though hoops to return -ENOBUFS if
>> and only if MSRs were in use and no buffer was present.
>>
>> This is fragile, and awkward from a toolstack point-of-view when actually
>> sending MSR content in the migration stream.  It also complicates fixing a
>> further race condition, between querying the number of MSRs for a vcpu, and
>> the vcpu touching a new one.
>>
>> As this code is still only in staging, take this opportunity to redesign the
> s/staging/unstable/

Oops - yes.

>
>> --- a/xen/arch/x86/domctl.c
>> +++ b/xen/arch/x86/domctl.c
>> @@ -1341,6 +1341,132 @@ long arch_do_domctl(
>>      }
>>      break;
>>  
>> +    case XEN_DOMCTL_get_vcpu_msrs:
>> +    case XEN_DOMCTL_set_vcpu_msrs:
>> +    {
>> +        struct xen_domctl_vcpu_msrs *vmsrs = &domctl->u.vcpu_msrs;
>> +        struct xen_domctl_vcpu_msr msr;
>> +        struct vcpu *v;
>> +        uint32_t nr_msrs = 0;
>> +
>> +        ret = -ESRCH;
>> +        if ( (vmsrs->vcpu >= d->max_vcpus) ||
>> +             ((v = d->vcpu[vmsrs->vcpu]) == NULL) )
>> +            break;
>> +
>> +        ret = -EINVAL;
>> +        if ( (v == current) || /* no vcpu_pause() */
>> +             !is_pv_domain(d) )
>> +            break;
>> +
>> +        /* Count maximum number of optional msrs. */
>> +        if ( boot_cpu_has(X86_FEATURE_DBEXT) )
>> +            nr_msrs += 4;
>> +
>> +        if ( domctl->cmd == XEN_DOMCTL_get_vcpu_msrs )
>> +        {
>> +            /* NULL guest handle is a request for max size. */
>> +            if ( guest_handle_is_null(vmsrs->msrs) ||
>> +                 (vmsrs->msr_count < nr_msrs) )
>> +            {
>> +                vmsrs->msr_count = nr_msrs;
>> +                ret = guest_handle_is_null(vmsrs->msrs) ? 0 : -ENOBUFS;
> I don't think you should be failing "get" if there is enough space in
> the provided buffer to store the actually used number of MSRs. That
> way the caller may make a first call with a few (rather than none at
> all) entries, an grow the buffer only if this wasn't sufficient.

I am not sure I agree.  The MSRs are unordered in the buffer which the
caller cannot control, and issuing a hypercall again with a larger
buffer will rewrite it from the start again.

The sole use of this hypercall needs to ensure that all MSRs are gotten,
otherwise VM corruption will occur.  Permitting a partial get will make
the return value ambiguous for making this hypercall a single time and
guessing at the size to use, although I suspect we are less interested
in this problem.

>
>> +                }
>> +
>> +                vcpu_unpause(v);
>> +
>> +                /* Check we didn't lie to userspace then overflow the buffer */
>> +                BUG_ON(i > nr_msrs);
>> +                vmsrs->msr_count = i;
>> +            }
>> +
>> +            copyback = 1;
>> +        }
>> +        else
>> +        {
>> +            ret = -EINVAL;
>> +            if ( vmsrs->msr_count > nr_msrs )
>> +                break;
> Similarly I don't think you should fail the operation simply based on a
> too large count. Who knows when or why it may make sense for the
> caller to specify multiple entries with the same index.
>
> But then again the way you do it we have a statically determined
> maximum and don't need to worry about preemption here until a
> really large number of MSRs would get handled.

For now, this prevents looping to the end of a guest controlled
uint32_t, as per some of the concerns in XSA-77.

If in the future, there is a scenario where reloading the same MSR
several times becomes valid, the logic can be re-addressed.

>
>> +
>> +            vcpu_pause(v);
>> +
>> +            for ( i = 0; i < vmsrs->msr_count; ++i )
>> +            {
>> +                ret = -EFAULT;
>> +                if ( copy_from_guest_offset(&msr, vmsrs->msrs, i, 1) )
>> +                    break;
>> +
>> +                ret = -EINVAL;
>> +                if ( msr.reserved )
>> +                    break;
>> +
>> +                switch ( msr.index )
>> +                {
>> +                case MSR_AMD64_DR0_ADDRESS_MASK:
>> +                    if ( !boot_cpu_has(X86_FEATURE_DBEXT) ||
>> +                         (msr.value >> 32) )
>> +                        break;
>> +                    v->arch.pv_vcpu.dr_mask[0] = msr.value;
>> +                    continue;
>> +
>> +                case MSR_AMD64_DR1_ADDRESS_MASK ...
>> +                    MSR_AMD64_DR3_ADDRESS_MASK:
>> +                    if ( !boot_cpu_has(X86_FEATURE_DBEXT) ||
>> +                         (msr.value >> 32) )
>> +                        break;
>> +                    msr.index -= MSR_AMD64_DR1_ADDRESS_MASK - 1;
>> +                    v->arch.pv_vcpu.dr_mask[msr.index] = msr.value;
>> +                    continue;
>> +                }
>> +                break;
>> +            }
>> +
>> +            vcpu_unpause(v);
>> +
>> +            if ( i == vmsrs->msr_count )
>> +                ret = 0;
> else {
>     vmsrs->msr_count = i
>     copyback = 1;
> }
>
> to at once give the caller an indication which slot failed?
>
> Jan

Ok - seems useful as a debugging measure.

~Andrew

next prev parent reply	other threads:[~2014-06-05 13:01 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-04 17:26 [PATCH 0/4] Fixes to several domctls for migration Andrew Cooper
2014-06-04 17:26 ` [PATCH 1/4] x86/domctl: Implement XEN_DOMCTL_{get, set}_vcpu_msrs Andrew Cooper
2014-06-05 12:46   ` Jan Beulich
2014-06-05 13:01     ` Andrew Cooper [this message]
2014-06-05 13:33       ` Jan Beulich
2014-06-06 14:53         ` Andrew Cooper
2014-06-06 15:09           ` Jan Beulich
2014-06-06 15:28             ` Andrew Cooper
2014-06-04 17:26 ` [PATCH 2/4] tools/libxc: Use an explicit check for PV MSRs in xc_domain_save() Andrew Cooper
2014-06-05 13:41   ` Jan Beulich
2014-06-05 15:52   ` Ian Campbell
2014-06-05 15:57     ` Andrew Cooper
2014-06-06  9:15       ` Ian Campbell
2014-06-06  9:44         ` Andrew Cooper
2014-06-06  9:48           ` Ian Campbell
2014-06-04 17:26 ` [PATCH 3/4] x86/domctl: Remove PV MSR parts of XEN_DOMCTL_[gs]et_ext_vcpucontext Andrew Cooper
2014-06-05  7:52   ` Frediano Ziglio
2014-06-05  9:25     ` Andrew Cooper
2014-06-04 17:26 ` [PATCH 4/4] x86/domctl: Two functional fixes to XEN_DOMCTL_[gs]etvcpuextstate Andrew Cooper

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53906A39.3070506@citrix.com \
    --to=andrew.cooper3@citrix.com \
    --cc=JBeulich@suse.com \
    --cc=keir@xen.org \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.