From: Juergen Gross
Subject: Re: RFC: xen config changes v4
Date: Fri, 27 Feb 2015 13:36:33 +0100
Message-ID: <54F064D1.8030604@suse.com>
References: <20150226015305.GE8749@wotan.suse.de> <54EEA847.6070505@suse.com> <54EEF0A7.6060804@citrix.com> <20150226172925.GL8749@wotan.suse.de> <54F00A36.1060608@suse.com> <54F03F2D.8050209@suse.com> <54F05550.2030106@suse.com>
To: Stefano Stabellini
Cc: Boris Ostrovsky, xen-devel@lists.xenproject.org, "Luis R. Rodriguez", David Vrabel, Jan Beulich
List-Id: xen-devel@lists.xenproject.org

On 02/27/2015 01:24 PM, Stefano Stabellini wrote:
> On Fri, 27 Feb 2015, Juergen Gross wrote:
>> On 02/27/2015 11:11 AM, Stefano Stabellini wrote:
>>> On Fri, 27 Feb 2015, Juergen Gross wrote:
>>>> On 02/27/2015 10:41 AM, Stefano Stabellini wrote:
>>>>> On Fri, 27 Feb 2015, Juergen Gross wrote:
>>>>>> On 02/26/2015 06:42 PM, Stefano Stabellini wrote:
>>>>>>> On Thu, 26 Feb 2015, Luis R. Rodriguez wrote:
>>>>>>>> On Thu, Feb 26, 2015 at 11:08:20AM +0000, Stefano Stabellini wrote:
>>>>>>>>> On Thu, 26 Feb 2015, David Vrabel wrote:
>>>>>>>>>> On 26/02/15 04:59, Juergen Gross wrote:
>>>>>>>>>>>
>>>>>>>>>>> So we are again in the situation that pv-drivers always imply the
>>>>>>>>>>> pvops kernel (PARAVIRT selected). I started the whole Kconfig rework
>>>>>>>>>>> to eliminate this dependency.
>>>>>>>>>>
>>>>>>>>>> Yes.
>>>>>>>>>> Can you produce a series that just addresses this one issue?
>>>>>>>>>>
>>>>>>>>>> In the absence of any concrete requirement for this big Kconfig reorg
>>>>>>>>>> I don't think it is helpful.
>>>>>>>>>
>>>>>>>>> I clearly missed some context as I didn't realize that this was the
>>>>>>>>> intended goal. Why do we want this? Please explain as it won't come
>>>>>>>>> for free.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> We have a few PV interfaces for HVM guests that need PARAVIRT in Linux
>>>>>>>>> in order to be used, for example pv_time_ops and HVMOP_pagetable_dying.
>>>>>>>>> They are critical performance improvements and from the interface
>>>>>>>>> perspective, small enough that it doesn't make much sense having a
>>>>>>>>> separate KConfig option for them.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> In order to reach the goal above we necessarily need to introduce a
>>>>>>>>> differentiation in terms of PV on HVM guests in Linux:
>>>>>>>>>
>>>>>>>>> 1) basic guests with PV network, disk, etc but no PV timers, no
>>>>>>>>>    HVMOP_pagetable_dying, no PV IPIs
>>>>>>>>> 2) full PV on HVM guests that have PV network, disk, timers,
>>>>>>>>>    HVMOP_pagetable_dying, PV IPIs and anything else that makes sense.
>>>>>>>>>
>>>>>>>>> 2) is much faster than 1) on Xen and 2) is only a tiny bit slower than 1)
>>>>>>>>> on native x86
>>>>>>>>
>>>>>>>> Also don't we shove 2) down hvm guests right now? Even when everything
>>>>>>>> is built in I do not see how we opt out for HVM for 1) at run time
>>>>>>>> right now.
>>>>>>>>
>>>>>>>> If this is true then the question of motivation for this becomes even
>>>>>>>> stronger I think.
>>>>>>>
>>>>>>> Yes, indeed there is no way to do 1) at the moment. And for good
>>>>>>> reasons, see above.
>>>>>>
>>>>>> Hmm, after checking the code I'm not convinced:
>>>>>>
>>>>>> - HVMOP_pagetable_dying is obsolete on modern hardware supporting
>>>>>>   EPT/HAP
>>>>>
>>>>> That might be true, but what about older hardware?
>>>>> Even on modern hardware a few workloads still run faster on shadow.
>>>>> But if HVMOP_pagetable_dying is the only reason to keep PARAVIRT for HVM
>>>>> guests, then I agree with you that we should remove it.
>>>>>
>>>>>
>>>>>> - PV IPIs are not needed on single-vcpu guests
>>>>>>
>>>>>> - PARAVIRT_CLOCK doesn't need PARAVIRT (in fact the SUSE kernel configs
>>>>>>   for all x86_64 kernels have CONFIG_PARAVIRT_CLOCK=y)
>>>>>>
>>>>>> So I think we really should enable building Xen frontends without
>>>>>> PARAVIRT, implying at least no XEN_PV and no XEN_PVH.
>>>>>>
>>>>>> I'll have a try setting up patches.
>>>>>
>>>>> If we are doing this as a performance improvement, I would like to see a
>>>>> couple of benchmarks (kernbench, hackbench) to show that on a
>>>>> single-vcpu guest and multi-vcpu guest (let's say 4 vcpus) disabling
>>>>> PARAVIRT leads to better performance on Xen on EPT hardware.
>>>>
>>>> This is not meant to be a performance improvement. It is meant to enable
>>>> a standard distro kernel configured without PARAVIRT to be able to run
>>>> as a HVM guest using the pv-drivers.
>>>
>>> This is not a convincing explanation. Debian, Ubuntu and Fedora seem
>>> to be able to cope with it just fine.
>>>
>>> Why do you want to do that, even though it will cause a performance
>>> regression and a maintenance pain? You haven't provided a reason yet.
>>>
>>
>> Either we are talking about different things, or I really don't
>> understand your problem here. I don't want to disable something. I
>> just want to enable kernels without PARAVIRT to run under Xen better
>> than today. Be it 32 bit non-PAE kernels as Ian pointed out or
>> distro kernels like e.g. SLES and probably RHEL.
>>
>> Using PV frontends is completely orthogonal to other PV enhancements
>> like PARAVIRT_CLOCK, HVMOP_pagetable_dying or PV IPIs. So why do you
>> object to enabling the PV frontends for those kernels?
>
> I am for it. I would like to avoid two user visible XEN enablement
> options (XEN_FRONTEND vs. XEN_PVHVM) for x86_64 and PAE HVM guests to
> avoid configurations with just XEN_FRONTEND, that can be considered a
> performance regression compared to what we have now (on x86_64 and PAE).

Would you be okay with making this an expert configuration alternative
for PAE/x86_64? This would enable the possibility to use PV drivers
for native-performance-tuned kernels. I would explicitly mention the
better alternative XEN_PVHVM in the Kconfig help text.

Juergen