Subject: Re: [PATCH v3 5/6] virt, sched: add cpu pinning to smp_call_sync_on_phys_cpu()
From: Juergen Gross
To: Peter Zijlstra
Cc: linux-kernel@vger.kernel.org, xen-devel@lists.xenproject.org,
 konrad.wilk@oracle.com, boris.ostrovsky@oracle.com, david.vrabel@citrix.com,
 mingo@redhat.com, Douglas_Warzecha@dell.com, pali.rohar@gmail.com,
 jdelvare@suse.com, linux@roeck-us.net, tglx@linutronix.de, hpa@zytor.com,
 jeremy@goop.org, chrisw@sous-sol.org, akataria@vmware.com,
 rusty@rustcorp.com.au, virtualization@lists.linux-foundation.org,
 x86@kernel.org
Date: Fri, 1 Apr 2016 11:03:21 +0200
Message-ID: <56FE3959.2030508@suse.com>
In-Reply-To: <20160401084408.GF3448@twins.programming.kicks-ass.net>
References: <1459494874-12194-1-git-send-email-jgross@suse.com>
 <1459494874-12194-6-git-send-email-jgross@suse.com>
 <20160401074325.GC12845@twins.programming.kicks-ass.net>
 <56FE313E.9060804@suse.com>
 <20160401084408.GF3448@twins.programming.kicks-ass.net>

On 01/04/16 10:44, Peter Zijlstra wrote:
> On Fri, Apr 01, 2016 at 10:28:46AM +0200, Juergen Gross wrote:
>> On 01/04/16 09:43, Peter Zijlstra wrote:
>>> On Fri, Apr 01, 2016 at 09:14:33AM +0200, Juergen Gross wrote:
>>>> --- a/kernel/smp.c
>>>> +++ b/kernel/smp.c
>>>> @@ -14,6 +14,7 @@
>>>>  #include <linux/smp.h>
>>>>  #include <linux/cpu.h>
>>>>  #include <linux/sched.h>
>>>> +#include <linux/hypervisor.h>
>>>>
>>>>  #include "smpboot.h"
>>>>
>>>> @@ -758,9 +759,14 @@ struct smp_sync_call_struct {
>>>>  static void smp_call_sync_callback(struct work_struct *work)
>>>>  {
>>>>  	struct smp_sync_call_struct *sscs;
>>>> +	unsigned int cpu;
>>>>
>>>>  	sscs = container_of(work, struct smp_sync_call_struct, work);
>>>> +	cpu = get_cpu();
>>>> +	hypervisor_pin_vcpu(cpu);
>>>>  	sscs->ret = sscs->func(sscs->data);
>>>> +	hypervisor_pin_vcpu(-1);
>>>> +	put_cpu();
>>>>
>>>>  	complete(&sscs->done);
>>>>  }
>>>
>>> So I don't really like this; it adds the requirement that the function
>>> cannot schedule, which greatly limits the utility of the construct. At
>>> this point you might as well use the regular IPI stuff.
>>
>> The main reason for disabling preemption was to avoid any suspend/resume
>> cycles while vcpu pinning is active.
>>
>> With the switch to workqueues this might not be necessary, if I've read
>> try_to_freeze_tasks() correctly. Can you confirm, please?
>
> This is not something we should worry about; the caller should ensure
> the CPU stays valid; typically I would expect a caller to do
> get_online_cpus() before 'computing' what CPU to send the function to.

Okay.

>>> So I would propose you add:
>>>
>>> smp_call_on_cpu()
>>>
>>> As per patch 2. No promises about physical or anything. This means it
>>> can be used freely by anyone that wants to run a function on another
>>> cpu -- a much more useful thing.
>>
>> Okay.
>>
>>> And then build a phys variant on top.
>>
>> Hmm, I'm not sure I understand what you are suggesting here.
>>
>> Should the phys variant make use of smp_call_on_cpu() via an
>> intermediate function called on the dedicated cpu, which then does the
>> pinning and calls the user function?
>>
>> Or do you want the phys variant to either use smp_call_on_cpu() or to
>> do the pinning and call the user function by itself, depending on
>> whether the environment supports pinning?
>
> Yeah, uhmm.. not sure on the details; my brain is having a hard time
> engaging this morning.
>
> Maybe just make the vpin thing an option like:
>
> smp_call_on_cpu(int (*func)(void *), int phys_cpu);

Okay, see the sketch at the end of this mail.

> Also; is something like the vpin thing possible on KVM? Because if
> we're going to expose it to generic code like this we had maybe better
> look at wider support.

Pinning is necessary for dom0 under Xen. I don't think it is needed on
KVM, as a guest has no direct access to e.g. BIOS functions of the real
hardware and the host system needs no vcpu pinning. I'm not sure about
VMware.
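
Maybe something like the following? Just a rough, completely untested
sketch: the struct is meant to be the smp_sync_call_struct from patch 2
extended with a cpu member, hypervisor_pin_vcpu() is the interface from
this series, and all other names and details are placeholders, not a
final implementation:

struct smp_sync_call_struct {
	struct work_struct	work;
	struct completion	done;
	int			(*func)(void *);
	void			*data;
	int			ret;
	int			cpu;	/* >= 0: pin vcpu to this physical cpu */
};

static void smp_call_on_cpu_callback(struct work_struct *work)
{
	struct smp_sync_call_struct *sscs;

	sscs = container_of(work, struct smp_sync_call_struct, work);
	if (sscs->cpu >= 0)
		hypervisor_pin_vcpu(sscs->cpu);
	sscs->ret = sscs->func(sscs->data);
	if (sscs->cpu >= 0)
		hypervisor_pin_vcpu(-1);

	complete(&sscs->done);
}

int smp_call_on_cpu(unsigned int cpu, int (*func)(void *), void *par,
		    bool phys)
{
	struct smp_sync_call_struct sscs = {
		.done = COMPLETION_INITIALIZER_ONSTACK(sscs.done),
		.func = func,
		.data = par,
		.cpu  = phys ? cpu : -1,	/* pin only for the phys variant */
	};

	if (cpu >= nr_cpu_ids || !cpu_online(cpu))
		return -ENXIO;

	INIT_WORK_ONSTACK(&sscs.work, smp_call_on_cpu_callback);
	queue_work_on(cpu, system_wq, &sscs.work);
	wait_for_completion(&sscs.done);
	destroy_work_on_stack(&sscs.work);

	return sscs.ret;
}

Callers wanting the plain variant would pass phys = false, while e.g.
the BIOS call users would pass phys = true to have the function run
pinned to the physical cpu while it executes.


Juergen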