From mboxrd@z Thu Jan 1 00:00:00 1970 From: Boris Ostrovsky Subject: Re: [PATCH v14 for-xen-4.5 11/21] x86/VPMU: Interface for setting PMU mode and flags Date: Mon, 27 Oct 2014 14:52:21 -0400 Message-ID: <544E9465.6040500@oracle.com> References: <1413580689-2750-1-git-send-email-boris.ostrovsky@oracle.com> <1413580689-2750-12-git-send-email-boris.ostrovsky@oracle.com> <544E7FC402000078000428C8@mail.emea.novell.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <544E7FC402000078000428C8@mail.emea.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Jan Beulich Cc: kevin.tian@intel.com, keir@xen.org, suravee.suthikulpanit@amd.com, andrew.cooper3@citrix.com, tim@xen.org, dietmar.hahn@ts.fujitsu.com, xen-devel@lists.xen.org, Aravind.Gopalakrishnan@amd.com, jun.nakajima@intel.com, dgdegra@tycho.nsa.gov List-Id: xen-devel@lists.xenproject.org On 10/27/2014 12:24 PM, Jan Beulich wrote: >>>> On 17.10.14 at 23:17, wrote: >> >> +static int vpmu_force_context_switch(void) >> +{ >> + unsigned i, numcpus, mycpu; >> + static s_time_t start; >> + struct vcpu *curr_vcpu = current; >> + static DEFINE_PER_CPU(struct tasklet *, sync_task); >> + int ret = 0; >> + >> + numcpus = num_online_cpus(); > I think you'd be better off counting these as you schedule the tasklets. > >> + mycpu = smp_processor_id(); >> + >> + if ( sync_vcpu != NULL ) /* if set, we may be in hypercall continuation */ >> + { >> + if (sync_vcpu != curr_vcpu ) > Coding style. > >> + /* We are not the original caller */ >> + return -EAGAIN; > That would seem to be the wrong return value then. Also, the HAP > side fix for XSA-97 taught us that identifying a caller by vCPU is > problematic - in the course of the retries the kernel's scheduler > may move the calling process to a different vCPU, yet it's still the > legitimate original caller. If the process is rescheduled then we will time out operation. I suppose I can set a bit in the argument's val to mark that particular argument as pending a continuation completion (I don't think we need to worry about malicious domain here since this is a privileged operation). > >> + goto cont_wait; >> + } >> + >> + for_each_online_cpu ( i ) >> + { >> + if ( i == mycpu ) >> + continue; >> + >> + per_cpu(sync_task, i) = xmalloc(struct tasklet); >> + if ( per_cpu(sync_task, i) == NULL ) >> + { >> + printk(XENLOG_WARNING "vpmu_force_context_switch: out of memory\n"); >> + ret = -ENOMEM; >> + goto out; >> + } >> + tasklet_init(per_cpu(sync_task, i), vpmu_sched_checkin, 0); >> + } >> + >> + /* First count is for self */ >> + atomic_set(&vpmu_sched_counter, 1); >> + >> + for_each_online_cpu ( i ) >> + { >> + if ( i != mycpu ) >> + tasklet_schedule_on_cpu(per_cpu(sync_task, i), i); >> + } >> + >> + vpmu_save(current); >> + >> + sync_vcpu = curr_vcpu; >> + start = NOW(); >> + >> + cont_wait: >> + /* >> + * Note that we may fail here if a CPU is hot-plugged while we are >> + * waiting. We will then time out. >> + */ > And I continue to miss the handling of the hot-unplug case (or at the > very least a note on this being unimplemented [and going to crash], > to at least clarify matters to the curious reader). Where would we crash? I have no interest in that. > >> + while ( atomic_read(&vpmu_sched_counter) != numcpus ) >> + { >> + s_time_t now; >> + >> + cpu_relax(); >> + >> + now = NOW(); >> + >> + /* Give up after (arbitrarily chosen) 5 seconds */ >> + if ( now > start + SECONDS(5) ) >> + { >> + printk(XENLOG_WARNING >> + "vpmu_force_context_switch: failed to sync\n"); >> + ret = -EBUSY; >> + break; >> + } >> + >> + if ( hypercall_preempt_check() ) >> + return hypercall_create_continuation( >> + __HYPERVISOR_xenpmu_op, "i", XENPMU_mode_set); > Did you test this code path? I don't see how with the missing second > hypercall argument the continuation could reliably succeed. I did test it (and retested it now) and it works. I guess it may be picking the same value from the stack during continuation which is why it does not fail. -boris