All of lore.kernel.org
 help / color / mirror / Atom feed
From: Boris Ostrovsky <boris.ostrovsky@oracle.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: kevin.tian@intel.com, keir@xen.org,
	suravee.suthikulpanit@amd.com, andrew.cooper3@citrix.com,
	tim@xen.org, dietmar.hahn@ts.fujitsu.com,
	xen-devel@lists.xen.org, Aravind.Gopalakrishnan@amd.com,
	jun.nakajima@intel.com, dgdegra@tycho.nsa.gov
Subject: Re: [PATCH v14 for-xen-4.5 11/21] x86/VPMU: Interface for setting PMU mode and flags
Date: Tue, 28 Oct 2014 12:56:38 -0400	[thread overview]
Message-ID: <544FCAC6.2050908@oracle.com> (raw)
In-Reply-To: <544F62050200007800042B11@mail.emea.novell.com>

On 10/28/2014 04:29 AM, Jan Beulich wrote:
>>>> On 27.10.14 at 19:52, <boris.ostrovsky@oracle.com> wrote:
>> On 10/27/2014 12:24 PM, Jan Beulich wrote:
>>>>>> On 17.10.14 at 23:17, <boris.ostrovsky@oracle.com> wrote:
>>>> +    mycpu = smp_processor_id();
>>>> +
>>>> +    if ( sync_vcpu != NULL ) /* if set, we may be in hypercall continuation */
>>>> +    {
>>>> +        if (sync_vcpu != curr_vcpu )
>>>> +            /* We are not the original caller */
>>>> +            return -EAGAIN;
>>> That would seem to be the wrong return value then. Also, the HAP
>>> side fix for XSA-97 taught us that identifying a caller by vCPU is
>>> problematic - in the course of the retries the kernel's scheduler
>>> may move the calling process to a different vCPU, yet it's still the
>>> legitimate original caller.
>> If the process is rescheduled then we will time out operation.
> And potentially never be able to complete it. Not an acceptable
> model imo.
>
>> I suppose I can set a bit in the argument's val to mark that particular
>> argument as pending a continuation completion (I don't think we need to
>> worry about malicious domain here since this is a privileged operation).
> Privileged in the sense that it (conceptually) will always be restricted
> to the hardware domain (or one granted equivalent privileges)?
> Please don't forget about disaggregation - newly added code will not
> get exceptions granted along the lines of XSA-77.
>
> Also "I can set a bit in ..." is too vague to say whether that would end
> up being an acceptable approach. The rationale behind the final
> behavior we gave the XSA-97 fix was that if the operation is privileged
> enough, it is okay for any vCPU of the originating domain to continue
> the current one (including the non-determinism of which of them will
> see the final successful completion of the hypercall, should more than
> one of them race). I think you ought to follow that model here and
> store/check the domain rather than the vCPU, in which case I don't
> think you'll need any extra bit(s).

I am not sure just keeping domainID is sufficient in this case. True, it 
doesn't matter which VCPU completes the operation but what we want to 
avoid is to have two simultaneous (and possibly different) requests from 
the same domain. If we keep it as some sort of a static variable (like I 
do now with sync_vcpu) then it will be difficult to distinguish which 
request is the continuation and which is a new one.

What I was suggesting is keeping some sort of state in the hypercall 
argument making it unique to the call. I said "a bit" but it can be, for 
example, setting the pad value in xen_pmu_params to some cookie 
(although that's probably not a particularly good idea since then the 
caller/domain would have to clear it before making the hypercall). So, 
if we set, say, the upper bit in xen_pmu_params.val before creating 
continuation then when we come back we will know for sure that this is 
indeed the continuation and not a new call.

The comment about privileged domain was to mean that we don't need to 
worry that the caller may maliciously try setting this bit in the hope 
of causing trouble. The caller can only be a privileged guest (even if 
it is the "disaggregated" guest that is responsible for VPMU management) 
and if it does it --- well, it will break VPMUs but nothing else. As far 
as I can tell, that is.

>
>>>> +        goto cont_wait;
>>>> +    }
>>>> +
>>>> +    for_each_online_cpu ( i )
>>>> +    {
>>>> +        if ( i == mycpu )
>>>> +            continue;
>>>> +
>>>> +        per_cpu(sync_task, i) = xmalloc(struct tasklet);
>>>> +        if ( per_cpu(sync_task, i) == NULL )
>>>> +        {
>>>> +            printk(XENLOG_WARNING "vpmu_force_context_switch: out of memory\n");
>>>> +            ret = -ENOMEM;
>>>> +            goto out;
>>>> +        }
>>>> +        tasklet_init(per_cpu(sync_task, i), vpmu_sched_checkin, 0);
>>>> +    }
>>>> +
>>>> +    /* First count is for self */
>>>> +    atomic_set(&vpmu_sched_counter, 1);
>>>> +
>>>> +    for_each_online_cpu ( i )
>>>> +    {
>>>> +        if ( i != mycpu )
>>>> +            tasklet_schedule_on_cpu(per_cpu(sync_task, i), i);
>>>> +    }
>>>> +
>>>> +    vpmu_save(current);
>>>> +
>>>> +    sync_vcpu = curr_vcpu;
>>>> +    start = NOW();
>>>> +
>>>> + cont_wait:
>>>> +    /*
>>>> +     * Note that we may fail here if a CPU is hot-plugged while we are
>>>> +     * waiting. We will then time out.
>>>> +     */
>>> And I continue to miss the handling of the hot-unplug case (or at the
>>> very least a note on this being unimplemented [and going to crash],
>>> to at least clarify matters to the curious reader).
>> Where would we crash? I have no interest in that.
> per_cpu() accesses are invalid for offline CPUs.


Right.

How about I get/put_cpu_maps() to prevent hotplug/unplug while we are 
doing this?


>
>>>> +    while ( atomic_read(&vpmu_sched_counter) != numcpus )
>>>> +    {
>>>> +        s_time_t now;
>>>> +
>>>> +        cpu_relax();
>>>> +
>>>> +        now = NOW();
>>>> +
>>>> +        /* Give up after (arbitrarily chosen) 5 seconds */
>>>> +        if ( now > start + SECONDS(5) )
>>>> +        {
>>>> +            printk(XENLOG_WARNING
>>>> +                   "vpmu_force_context_switch: failed to sync\n");
>>>> +            ret = -EBUSY;
>>>> +            break;
>>>> +        }
>>>> +
>>>> +        if ( hypercall_preempt_check() )
>>>> +            return hypercall_create_continuation(
>>>> +                __HYPERVISOR_xenpmu_op, "i", XENPMU_mode_set);
>>> Did you test this code path? I don't see how with the missing second
>>> hypercall argument the continuation could reliably succeed.
>> I did test it (and retested it now) and it works. I guess it may be
>> picking the same value from the stack during continuation which is why
>> it does not fail.
> Oh, right, hypercall argument clobbering (in debug builds) gets
> skipped for continuations (and no clobbering is being done for
> HVM/PVH at all). But I don't think you should rely on this, i.e. the
> invocation above should get fixed in any event.


Of course. I was simply saying why the tests passed, not why it should 
stay this way.

-boris

  reply	other threads:[~2014-10-28 16:56 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-10-17 21:17 [PATCH v14 for-xen-4.5 00/21] x86/PMU: Xen PMU PV(H) support Boris Ostrovsky
2014-10-17 21:17 ` [PATCH v14 for-xen-4.5 01/21] common/symbols: Export hypervisor symbols to privileged guest Boris Ostrovsky
2014-10-17 21:17 ` [PATCH v14 for-xen-4.5 02/21] x86/VPMU: Manage VPMU_CONTEXT_SAVE flag in vpmu_save_force() Boris Ostrovsky
2014-10-17 21:17 ` [PATCH v14 for-xen-4.5 03/21] x86/VPMU: Set MSR bitmaps only for HVM/PVH guests Boris Ostrovsky
2014-10-17 21:17 ` [PATCH v14 for-xen-4.5 04/21] x86/VPMU: Make vpmu macros a bit more efficient Boris Ostrovsky
2014-10-17 21:17 ` [PATCH v14 for-xen-4.5 05/21] intel/VPMU: Clean up Intel VPMU code Boris Ostrovsky
2014-10-17 21:17 ` [PATCH v14 for-xen-4.5 06/21] vmx: Merge MSR management routines Boris Ostrovsky
2014-10-17 21:17 ` [PATCH v14 for-xen-4.5 07/21] x86/VPMU: Handle APIC_LVTPC accesses Boris Ostrovsky
2014-10-17 21:17 ` [PATCH v14 for-xen-4.5 08/21] intel/VPMU: MSR_CORE_PERF_GLOBAL_CTRL should be initialized to zero Boris Ostrovsky
2014-10-17 21:17 ` [PATCH v14 for-xen-4.5 09/21] x86/VPMU: Add public xenpmu.h Boris Ostrovsky
2014-10-24 16:00   ` Jan Beulich
2014-10-17 21:17 ` [PATCH v14 for-xen-4.5 10/21] x86/VPMU: Make vpmu not HVM-specific Boris Ostrovsky
2014-10-17 21:17 ` [PATCH v14 for-xen-4.5 11/21] x86/VPMU: Interface for setting PMU mode and flags Boris Ostrovsky
2014-10-27 16:24   ` Jan Beulich
2014-10-27 18:52     ` Boris Ostrovsky
2014-10-28  8:29       ` Jan Beulich
2014-10-28 16:56         ` Boris Ostrovsky [this message]
2014-10-29  8:14           ` Jan Beulich
2014-10-29 14:22             ` Boris Ostrovsky
2014-10-29 16:50               ` Jan Beulich
2014-10-17 21:18 ` [PATCH v14 for-xen-4.5 12/21] x86/VPMU: Initialize AMD and Intel VPMU with __initcall Boris Ostrovsky
2014-10-17 21:18 ` [PATCH v14 for-xen-4.5 13/21] x86/VPMU: Initialize PMU for PV(H) guests Boris Ostrovsky
2014-10-27 16:38   ` Jan Beulich
2014-10-27 19:21     ` Boris Ostrovsky
2014-10-17 21:18 ` [PATCH v14 for-xen-4.5 14/21] x86/VPMU: Save VPMU state for PV guests during context switch Boris Ostrovsky
2014-10-17 21:18 ` [PATCH v14 for-xen-4.5 15/21] x86/VPMU: When handling MSR accesses, leave fault injection to callers Boris Ostrovsky
2014-10-17 21:18 ` [PATCH v14 for-xen-4.5 16/21] x86/VPMU: Add support for PMU register handling on PV guests Boris Ostrovsky
2014-10-17 21:18 ` [PATCH v14 for-xen-4.5 17/21] x86/VPMU: Handle PMU interrupts for " Boris Ostrovsky
2014-10-27 16:54   ` Jan Beulich
2014-10-27 19:43     ` Boris Ostrovsky
2014-10-28  9:30       ` Jan Beulich
2014-10-28 17:08         ` Boris Ostrovsky
2014-10-29  8:19           ` Jan Beulich
2014-10-17 21:18 ` [PATCH v14 for-xen-4.5 18/21] x86/VPMU: Merge vpmu_rdmsr and vpmu_wrmsr Boris Ostrovsky
2014-10-17 21:18 ` [PATCH v14 for-xen-4.5 19/21] x86/VPMU: Add privileged PMU mode Boris Ostrovsky
2014-10-17 21:18 ` [PATCH v14 for-xen-4.5 20/21] x86/VPMU: NMI-based VPMU support Boris Ostrovsky
2014-10-28 10:51   ` Jan Beulich
2014-10-17 21:18 ` [PATCH v14 for-xen-4.5 21/21] x86/VPMU: Move VPMU files up from hvm/ directory Boris Ostrovsky
2014-10-28 10:52   ` Jan Beulich
2014-10-27  7:38 ` [PATCH v14 for-xen-4.5 00/21] x86/PMU: Xen PMU PV(H) support Dietmar Hahn
2014-10-27 13:47   ` Boris Ostrovsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=544FCAC6.2050908@oracle.com \
    --to=boris.ostrovsky@oracle.com \
    --cc=Aravind.Gopalakrishnan@amd.com \
    --cc=JBeulich@suse.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=dgdegra@tycho.nsa.gov \
    --cc=dietmar.hahn@ts.fujitsu.com \
    --cc=jun.nakajima@intel.com \
    --cc=keir@xen.org \
    --cc=kevin.tian@intel.com \
    --cc=suravee.suthikulpanit@amd.com \
    --cc=tim@xen.org \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.