From mboxrd@z Thu Jan  1 00:00:00 1970
From: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Subject: Re: [PATCH v14 for-xen-4.5 11/21] x86/VPMU: Interface
 for setting PMU mode and flags
Date: Wed, 29 Oct 2014 10:22:31 -0400
Message-ID: <5450F827.4080509@oracle.com>
References: <1413580689-2750-1-git-send-email-boris.ostrovsky@oracle.com>
	<1413580689-2750-12-git-send-email-boris.ostrovsky@oracle.com>
	<544E7FC402000078000428C8@mail.emea.novell.com>
	<544E9465.6040500@oracle.com>
	<544F62050200007800042B11@mail.emea.novell.com>
	<544FCAC6.2050908@oracle.com>
	<5450B00A0200007800042F7B@mail.emea.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
In-Reply-To: <5450B00A0200007800042F7B@mail.emea.novell.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich <JBeulich@suse.com>
Cc: kevin.tian@intel.com, keir@xen.org, suravee.suthikulpanit@amd.com, andrew.cooper3@citrix.com, tim@xen.org, dietmar.hahn@ts.fujitsu.com, xen-devel@lists.xen.org, Aravind.Gopalakrishnan@amd.com, jun.nakajima@intel.com, dgdegra@tycho.nsa.gov
List-Id: xen-devel@lists.xenproject.org

On 10/29/2014 04:14 AM, Jan Beulich wrote:
>>>> On 28.10.14 at 17:56, <boris.ostrovsky@oracle.com> wrote:
>> On 10/28/2014 04:29 AM, Jan Beulich wrote:
>>> Also "I can set a bit in ..." is too vague to say whether that would end
>>> up being an acceptable approach. The rationale behind the final
>>> behavior we gave the XSA-97 fix was that if the operation is privileged
>>> enough, it is okay for any vCPU of the originating domain to continue
>>> the current one (including the non-determinism of which of them will
>>> see the final successful completion of the hypercall, should more than
>>> one of them race). I think you ought to follow that model here and
>>> store/check the domain rather than the vCPU, in which case I don't
>>> think you'll need any extra bit(s).
>> I am not sure just keeping domainID is sufficient in this case. True, it
>> doesn't matter which VCPU completes the operation but what we want to
>> avoid is to have two simultaneous (and possibly different) requests from
>> the same domain. If we keep it as some sort of a static variable (like I
>> do now with sync_vcpu) then it will be difficult to distinguish which
>> request is the continuation and which is a new one.
> A static variable may indeed be insufficient here. Did you look at
> the XSA-97 change at all, trying to mirror its logic here?

You mean storing this in domain structure? I don't want to add new 
fields to such a common structure for an operation that is exceedingly 
inferquent.

>
>> What I was suggesting is keeping some sort of state in the hypercall
>> argument making it unique to the call. I said "a bit" but it can be, for
>> example, setting the pad value in xen_pmu_params to some cookie
>> (although that's probably not a particularly good idea since then the
>> caller/domain would have to clear it before making the hypercall). So,
>> if we set, say, the upper bit in xen_pmu_params.val before creating
>> continuation then when we come back we will know for sure that this is
>> indeed the continuation and not a new call.
> Whatever state in the hypercall arguments you alter, a malicious or
> buggy caller could do the same to an original request.
>
> However, I wonder whether a model without continuations (and
> hence not along the lines of what we did for XSA-97) might not be
> better here after all:
>
> 1) Considering that you don't need access to the hypercall
> arguments after initial evaluation, continue_hypercall_on_cpu()
> would seem usable here: Once you visited all CPUs, you can be
> certain a context switch occurred everywhere.
>
> 2) You could pause the current vCPU after scheduling all tasklets
> and have the last one unpause it and do the necessary cleanup.

This sounds simpler than what I have now.

I don't think I will need the tasklets with this approach: they are all 
part of continue_hypercall_on_cpu()?

As for pausing the VCPU? Won't the continue_hypercall_on_cpu() keep it 
asleep until everyone has completed?


>
>>>>>> + cont_wait:
>>>>>> +    /*
>>>>>> +     * Note that we may fail here if a CPU is hot-plugged while we are
>>>>>> +     * waiting. We will then time out.
>>>>>> +     */
>>>>> And I continue to miss the handling of the hot-unplug case (or at the
>>>>> very least a note on this being unimplemented [and going to crash],
>>>>> to at least clarify matters to the curious reader).
>>>> Where would we crash? I have no interest in that.
>>> per_cpu() accesses are invalid for offline CPUs.
>> Right.
>>
>> How about I get/put_cpu_maps() to prevent hotplug/unplug while we are
>> doing this?
> That's more the last resort solution. I'd prefer if you made your loops
> simply a little more careful. Remember that hot-unplug can't happen
> while your code is executing, it can only hit while you are awaiting a
> continuation to occur.

I didn't realize that. But let me try what you suggested above.

-boris