From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Matt Wilson <msw@amazon.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>,
"Keir (Xen.org)" <keir@xen.org>,
"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [PATCH] xen: schedule: allow dom0 vCPUs to be re-pinned when dom0_vcpus_pin is set
Date: Wed, 5 Dec 2012 16:06:39 +0000
Message-ID: <50BF710F.2070407@citrix.com>
In-Reply-To: <20121205155906.GA32088@u109add4315675089e695.ant.amazon.com>
On 05/12/12 15:59, Matt Wilson wrote:
> On Wed, Dec 05, 2012 at 10:44:04AM +0000, Andrew Cooper wrote:
>> On 05/12/12 06:02, Matt Wilson wrote:
>>> An administrator may want Xen to pin dom0 vCPUs to pCPUs 1:1 at boot,
>>> but still have the flexibility to change the configuration later.
>>> There's no logic that keys off of domain->is_pinned outside of
>>> sched_init_vcpu() and vcpu_set_affinity(). By adjusting the
>>> is_pinned_vcpu() macro to only check for a single CPU set in the
>>> cpu_affinity mask, dom0 vCPUs can safely be re-pinned after the system
>>> boots.
>> Sadly this patch will break things. There are certain callers of
>> is_pinned_vcpu() which rely on the value to allow access to certain
>> power related MSRs, which is where the requirement for never permitting
>> an update of the affinity mask comes from.
> If this is true, the existing is_pinned_vcpu() test is broken:
>
> #define is_pinned_vcpu(v) ((v)->domain->is_pinned || \
> cpumask_weight((v)->cpu_affinity) == 1)
>
> It's || not &&. So if someone pins dom0 vCPUs to pCPUs 1:1 after boot,
> the MSR traps will suddenly start working.
>
> See commit: http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=cc0854dd
>
>> When I encountered this problem before, I considered implementing
>> dom0_vcpu_pin=dynamic (or name to suit) which sets up an identity pin at
>> create time, but leaves is_pinned as false.
> I could implement that, but I want to make sure we're fixing a real
> problem. It sounds like Keir thinks this can be relaxed.
>
> Matt
Hmm - the logic does indeed look broken.  When I was working on this
problem, it was on an older tree which did not have this changeset.

With the current logic, a vcpu's access to these power-related MSRs will
transparently change whenever its affinity is changed, which does not
sound safe for an unaware vcpu.  I did not pay much attention to what
the MSRs actually do.
~Andrew
>
>>> Signed-off-by: Matt Wilson <msw@amazon.com>
>>>
>>> diff -r 29247e44df47 -r 2614dd8be3a0 docs/misc/xen-command-line.markdown
>>> --- a/docs/misc/xen-command-line.markdown Fri Nov 30 21:51:17 2012 +0000
>>> +++ b/docs/misc/xen-command-line.markdown Wed Dec 05 05:48:23 2012 +0000
>>> @@ -453,7 +453,7 @@ Practices](http://wiki.xen.org/wiki/Xen_
>>>
>>> > Default: `false`
>>>
>>> -Pin dom0 vcpus to their respective pcpus
>>> +Initially pin dom0 vcpus to their respective pcpus
>>>
>>> ### e820-mtrr-clip
>>> > `= <boolean>`
>>> diff -r 29247e44df47 -r 2614dd8be3a0 xen/common/domain.c
>>> --- a/xen/common/domain.c Fri Nov 30 21:51:17 2012 +0000
>>> +++ b/xen/common/domain.c Wed Dec 05 05:48:23 2012 +0000
>>> @@ -45,10 +45,6 @@
>>> /* xen_processor_pmbits: xen control Cx, Px, ... */
>>> unsigned int xen_processor_pmbits = XEN_PROCESSOR_PM_PX;
>>>
>>> -/* opt_dom0_vcpus_pin: If true, dom0 VCPUs are pinned. */
>>> -bool_t opt_dom0_vcpus_pin;
>>> -boolean_param("dom0_vcpus_pin", opt_dom0_vcpus_pin);
>>> -
>>> /* Protect updates/reads (resp.) of domain_list and domain_hash. */
>>> DEFINE_SPINLOCK(domlist_update_lock);
>>> DEFINE_RCU_READ_LOCK(domlist_read_lock);
>>> @@ -235,7 +231,6 @@ struct domain *domain_create(
>>>
>>> if ( domid == 0 )
>>> {
>>> - d->is_pinned = opt_dom0_vcpus_pin;
>>> d->disable_migrate = 1;
>>> }
>>>
>>> diff -r 29247e44df47 -r 2614dd8be3a0 xen/common/schedule.c
>>> --- a/xen/common/schedule.c Fri Nov 30 21:51:17 2012 +0000
>>> +++ b/xen/common/schedule.c Wed Dec 05 05:48:23 2012 +0000
>>> @@ -52,6 +52,11 @@ boolean_param("sched_smt_power_savings",
>>> * */
>>> int sched_ratelimit_us = SCHED_DEFAULT_RATELIMIT_US;
>>> integer_param("sched_ratelimit_us", sched_ratelimit_us);
>>> +
>>> +/* opt_dom0_vcpus_pin: If true, dom0 VCPUs are pinned at boot. */
>>> +bool_t opt_dom0_vcpus_pin;
>>> +boolean_param("dom0_vcpus_pin", opt_dom0_vcpus_pin);
>>> +
>>> /* Various timer handlers. */
>>> static void s_timer_fn(void *unused);
>>> static void vcpu_periodic_timer_fn(void *data);
>>> @@ -194,7 +199,8 @@ int sched_init_vcpu(struct vcpu *v, unsi
>>> * domain-0 VCPUs, are pinned onto their respective physical CPUs.
>>> */
>>> v->processor = processor;
>>> - if ( is_idle_domain(d) || d->is_pinned )
>>> +
>>> + if ( is_idle_domain(d) || (d->domain_id == 0 && opt_dom0_vcpus_pin) )
>>> cpumask_copy(v->cpu_affinity, cpumask_of(processor));
>>> else
>>> cpumask_setall(v->cpu_affinity);
>>> @@ -595,8 +601,6 @@ int vcpu_set_affinity(struct vcpu *v, co
>>> cpumask_t online_affinity;
>>> cpumask_t *online;
>>>
>>> - if ( v->domain->is_pinned )
>>> - return -EINVAL;
>>> online = VCPU2ONLINE(v);
>>> cpumask_and(&online_affinity, affinity, online);
>>> if ( cpumask_empty(&online_affinity) )
>>> diff -r 29247e44df47 -r 2614dd8be3a0 xen/include/xen/sched.h
>>> --- a/xen/include/xen/sched.h Fri Nov 30 21:51:17 2012 +0000
>>> +++ b/xen/include/xen/sched.h Wed Dec 05 05:48:23 2012 +0000
>>> @@ -292,8 +292,6 @@ struct domain
>>> enum { DOMDYING_alive, DOMDYING_dying, DOMDYING_dead } is_dying;
>>> /* Domain is paused by controller software? */
>>> bool_t is_paused_by_controller;
>>> - /* Domain's VCPUs are pinned 1:1 to physical CPUs? */
>>> - bool_t is_pinned;
>>>
>>> /* Are any VCPUs polling event channels (SCHEDOP_poll)? */
>>> #if MAX_VIRT_CPUS <= BITS_PER_LONG
>>> @@ -713,8 +711,7 @@ void watchdog_domain_destroy(struct doma
>>>
>>> #define is_hvm_domain(d) ((d)->is_hvm)
>>> #define is_hvm_vcpu(v) (is_hvm_domain(v->domain))
>>> -#define is_pinned_vcpu(v) ((v)->domain->is_pinned || \
>>> - cpumask_weight((v)->cpu_affinity) == 1)
>>> +#define is_pinned_vcpu(v) (cpumask_weight((v)->cpu_affinity) == 1)
>>> #ifdef HAS_PASSTHROUGH
>>> #define need_iommu(d) ((d)->need_iommu)
>>> #else
>>>
>>> _______________________________________________
>>> Xen-devel mailing list
>>> Xen-devel@lists.xen.org
>>> http://lists.xen.org/xen-devel
>> --
>> Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
>> T: +44 (0)1223 225 900, http://www.citrix.com
--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
Thread overview: 11+ messages
2012-12-05 6:02 [PATCH] xen: schedule: allow dom0 vCPUs to be re-pinned when dom0_vcpus_pin is set Matt Wilson
2012-12-05 10:44 ` Andrew Cooper
2012-12-05 15:59 ` Matt Wilson
2012-12-05 16:04 ` Matt Wilson
2012-12-05 16:06 ` Andrew Cooper [this message]
2012-12-05 16:25 ` Jan Beulich
2012-12-05 17:06 ` Matt Wilson
2012-12-05 17:16 ` George Dunlap
2012-12-06 10:44 ` Jan Beulich
2012-12-10 22:01 ` Matt Wilson
2012-12-11 10:43 ` Jan Beulich