From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Boris Ostrovsky <boris.ostrovsky@oracle.com>, xen-devel@lists.xen.org
Cc: wei.liu2@citrix.com, ian.jackson@eu.citrix.com,
	jbeulich@suse.com, roger.pau@citrix.com
Subject: Re: [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus
Date: Mon, 14 Nov 2016 17:17:26 +0000	[thread overview]
Message-ID: <eba674e2-d384-0fc7-0f41-3b2d1f50d1f6@citrix.com> (raw)
In-Reply-To: <25c8e7b5-154a-c6a2-e7c6-900cdd41ffcc@oracle.com>

On 14/11/16 14:59, Boris Ostrovsky wrote:
> On 11/09/2016 02:47 PM, Boris Ostrovsky wrote:
>> On 11/09/2016 02:23 PM, Andrew Cooper wrote:
>>> On 09/11/16 15:29, Boris Ostrovsky wrote:
>>>> On 11/09/2016 10:04 AM, Andrew Cooper wrote:
>>>>> On 09/11/16 14:39, Boris Ostrovsky wrote:
>>>>>> This domctl is called when a VCPU is hot-(un)plugged to a guest (via
>>>>>> 'xl vcpu-set'). While it is currently only needed by PVH guests, we
>>>>>> will call this domctl for all (x86) guests for consistency.
>>>>>>
>>>>>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>>>>>> Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
>>>>>> ---
>>>>>> Changes in v2:
>>>>>> * Added comment in arch_do_domctl() stating that the domctl is only required
>>>>>>   for PVH guests
>>>>> I am not happy with this change until we understand why it is needed.
>>>>>
>>>>> Are we genuinely saying that there is no current enforcement in the
>>>>> PV-hotplug mechanism?
>>>> That's right. Don't call setup_cpu_watcher() in Linux and you will be
>>>> running with maxvcpus.
>>> /sigh - Quality engineering there...
>>>
>>> Yes - let's take the time to actually do this properly.
>>>
>>>>>> diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
>>>>>> index 2a2fe04..b736d4c 100644
>>>>>> --- a/xen/arch/x86/domctl.c
>>>>>> +++ b/xen/arch/x86/domctl.c
>>>>>> @@ -1430,6 +1430,23 @@ long arch_do_domctl(
>>>>>>          }
>>>>>>          break;
>>>>>>  
>>>>>> +    case XEN_DOMCTL_set_avail_vcpus:
>>>>>> +    {
>>>>>> +        unsigned int num = domctl->u.avail_vcpus.num;
>>>>>> +
>>>>>> +        /*
>>>>>> +         * This is currently only needed by PVH guests (but
>>>>>> +         * any guest is free to make this call).
>>>>>> +         */
>>>>>> +        ret = -EINVAL;
>>>>>> +        if ( num > d->max_vcpus )
>>>>>> +            break;
>>>>>> +
>>>>>> +        d->arch.avail_vcpus = num;
>>>>>> +        ret = 0;
>>>>>> +        break;
>>>>>> +    }
>>>>> What do you actually mean by "avail_vcpus"?  What happens if a vcpu
>>>>> higher than the new number is currently online and running?  What
>>>>> happens to the already-existing vcpus-at-startup value?
>>>> It shouldn't happen: we set avail_vcpus at domain creation time to
>>>> vcpus-at-startup.
>>>>
>>>> The name is not great. It would have been better to call it
>>>> online_vcpus, but that name usually implies that _VPF_down is clear,
>>>> which isn't always the case here: the guest can offline a VCPU
>>>> (setting _VPF_down) while it remains available.
>>> How about an availability bitmap instead, which always has max_vcpus
>>> bits in it?  Xen should consult the bitmap before allowing a VM to
>>> online a new vcpu.
>> We could indeed use a bitmap (and then it will actually be easier to
>> handle the io request, as we can just copy the appropriate bytes of the
>> map instead of going bit-by-bit). This will still require the new domctl.

Given the lack of any enforcement mechanism, introducing one is clearly
needed.  Please say so in the commit message.
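
As a purely illustrative sketch of that enforcement (the bitmap field
name and the exact hook placement are assumptions, not taken from the
series), the VCPUOP_up handler could refuse vcpus outside the permitted
set:

    case VCPUOP_up:
        /*
         * Sketch only: refuse to online a vcpu whose bit is clear in
         * an availability bitmap maintained via the new domctl.  The
         * field name avail_vcpus_bitmap is hypothetical.
         */
        if ( !test_bit(v->vcpu_id, d->arch.avail_vcpus_bitmap) )
        {
            rc = -EPERM;
            break;
        }
        /* ... then continue with the existing _VPF_down handling ... */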

>>
>> I am not convinced, though, that we can start enforcing this new VCPU
>> count, at least for PV guests: they expect to bring up all max VCPUs and
>> then offline them. That can, of course, be fixed, but all non-updated
>> kernels would then stop booting.
>
> How about we don't clear _VPF_down if the bit in the availability bitmap
> is not set?

This is yet another PV mess.  We clearly need to quirk PV guests as the
exception to sanity, given that they expect (and have been able to)
online all cpus at start-of-day.

To avoid race conditions, you necessarily need to be able to set a
reduced permitted map before asking the VM to unplug.
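
To make that ordering concrete, here is a sketch of the toolstack-side
unplug path (every helper name below is hypothetical, for illustration
only, not the real libxc/libxl code):

    /* 'xl vcpu-set' down to n vcpus -- sketch under assumed names. */
    int vcpu_set_down(xc_interface *xch, uint32_t domid, unsigned int n)
    {
        int rc;

        /* 1. Shrink the permitted map in Xen first, so a racing guest
         *    cannot re-online a vcpu we are about to remove. */
        rc = xc_domain_set_avail_vcpus(xch, domid, n); /* hypothetical */
        if ( rc )
            return rc;

        /* 2. Only then ask the guest, via xenstore, to offline them. */
        return guest_request_offline(domid, n);        /* hypothetical */
    }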

For HVM guests, we can set a proper permitted map at boot, and really
should do so.

For PV guests, we have to wait until the guest has completed its SMP
bringup before reducing the permitted set.  Therefore, the initial
set_avail_vcpus call could perhaps be deferred until the first unplug
request?

It also occurs to me that you necessarily need a get_avail_vcpus
hypercall to be able to use this interface sensibly from the toolstack.
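
For illustration, such a getter could mirror the setter quoted above;
this sub-op does not exist in the series, so the name and the reuse of
arch_do_domctl()'s existing copyback mechanism are assumptions:

    case XEN_DOMCTL_get_avail_vcpus:   /* hypothetical sub-op */
        domctl->u.avail_vcpus.num = d->arch.avail_vcpus;
        copyback = 1;  /* have the common code copy the struct back */
        ret = 0;
        break;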

~Andrew

Thread overview: 76+ messages
2016-11-09 14:39 [PATCH v2 00/11] PVH VCPU hotplug support Boris Ostrovsky
2016-11-09 14:39 ` [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus Boris Ostrovsky
2016-11-09 15:04   ` Andrew Cooper
2016-11-09 15:29     ` Boris Ostrovsky
2016-11-09 19:23       ` Andrew Cooper
2016-11-09 19:47         ` Boris Ostrovsky
2016-11-14 14:59           ` Boris Ostrovsky
2016-11-14 17:17             ` Andrew Cooper [this message]
2016-11-14 17:48               ` Boris Ostrovsky
2016-11-14 18:19                 ` Andrew Cooper
2016-11-14 18:44                   ` Boris Ostrovsky
2016-11-15 16:41                     ` Andrew Cooper
2016-11-15 17:04                       ` Boris Ostrovsky
2016-11-15 17:30                         ` Andrew Cooper
2016-11-15  8:34   ` Jan Beulich
2016-11-15 14:28     ` Boris Ostrovsky
2016-11-15 14:33       ` Jan Beulich
2016-11-15 15:00         ` Boris Ostrovsky
2016-11-09 14:39 ` [PATCH v2 02/11] acpi: Define ACPI IO registers for PVH guests Boris Ostrovsky
2016-11-09 14:48   ` Julien Grall
2016-11-09 14:54     ` Boris Ostrovsky
2016-11-09 14:48   ` Paul Durrant
2016-11-09 14:59   ` Andrew Cooper
2016-11-09 15:14     ` Boris Ostrovsky
2016-11-09 19:58       ` Andrew Cooper
2016-11-09 21:01         ` Boris Ostrovsky
2016-11-14 15:01           ` Boris Ostrovsky
2016-11-14 17:19             ` Andrew Cooper
2016-11-15  8:47   ` Jan Beulich
2016-11-15 14:47     ` Boris Ostrovsky
2016-11-15 15:13       ` Jan Beulich
2016-11-15 15:41         ` Boris Ostrovsky
2016-11-15 15:53           ` Jan Beulich
2016-11-15 16:23             ` Boris Ostrovsky
2016-11-15 16:33               ` Jan Beulich
2016-11-15 16:58                 ` Boris Ostrovsky
2016-11-15 16:59                   ` Jan Beulich
2016-11-15 18:31                   ` Andrew Cooper
2016-11-09 14:39 ` [PATCH v2 03/11] pvh: Set online VCPU map to avail_vcpus Boris Ostrovsky
2016-11-11 19:57   ` Konrad Rzeszutek Wilk
2016-11-12 15:40     ` Wei Liu
2016-11-09 14:39 ` [PATCH v2 04/11] acpi: Make pmtimer optional in FADT Boris Ostrovsky
2016-11-15  8:49   ` Jan Beulich
2016-11-09 14:39 ` [PATCH v2 05/11] acpi: Power and Sleep ACPI buttons are not emulated for PVH guests Boris Ostrovsky
2016-11-11 19:56   ` Konrad Rzeszutek Wilk
2016-11-15  8:54   ` Jan Beulich
2016-11-09 14:39 ` [PATCH v2 06/11] acpi: PVH guests need _E02 method Boris Ostrovsky
2016-11-11 19:58   ` Konrad Rzeszutek Wilk
2016-11-15  8:57   ` Jan Beulich
2016-11-09 14:39 ` [PATCH v2 07/11] pvh/ioreq: Install handlers for ACPI-related PVH IO accesses Boris Ostrovsky
2016-11-09 14:56   ` Paul Durrant
2016-11-11 20:01   ` Konrad Rzeszutek Wilk
2016-11-15  9:04   ` Jan Beulich
2016-11-09 14:39 ` [PATCH v2 08/11] pvh/acpi: Handle ACPI accesses for PVH guests Boris Ostrovsky
2016-11-09 14:58   ` Paul Durrant
2016-11-11 20:07   ` Konrad Rzeszutek Wilk
2016-11-15  9:24   ` Jan Beulich
2016-11-15 14:55     ` Boris Ostrovsky
2016-11-15 15:17       ` Jan Beulich
2016-11-15 15:44         ` Boris Ostrovsky
2016-11-15 15:56           ` Jan Beulich
2016-11-15 19:19             ` Andrew Cooper
2016-11-15 19:38               ` Boris Ostrovsky
2016-11-15 20:07                 ` Andrew Cooper
2016-11-15 20:20                   ` Boris Ostrovsky
2016-11-17  0:00                     ` Boris Ostrovsky
2016-11-17  0:08                       ` Andrew Cooper
2016-11-16  9:23               ` Jan Beulich
2016-11-09 14:39 ` [PATCH v2 09/11] events/x86: Define SCI virtual interrupt Boris Ostrovsky
2016-11-15  9:29   ` Jan Beulich
2016-11-09 14:39 ` [PATCH v2 10/11] pvh: Send an SCI on VCPU hotplug event Boris Ostrovsky
2016-11-15  9:36   ` Jan Beulich
2016-11-15 14:57     ` Boris Ostrovsky
2016-11-15 15:18       ` Jan Beulich
2016-11-15  9:38   ` Jan Beulich
2016-11-09 14:39 ` [PATCH v2 11/11] docs: Describe PVHv2's VCPU hotplug procedure Boris Ostrovsky
