From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S934214AbcBDTy2 (ORCPT <rfc822;w@1wt.eu>);
	Thu, 4 Feb 2016 14:54:28 -0500
Received: from aserp1040.oracle.com ([141.146.126.69]:17777 "EHLO
	aserp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S932469AbcBDTy0 (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Thu, 4 Feb 2016 14:54:26 -0500
Subject: Re: [PATCH v2 02/11] xen/hvmlite: Bootstrap HVMlite guest
To: "Luis R. Rodriguez" <mcgrof@suse.com>
References: <1454341137-14110-1-git-send-email-boris.ostrovsky@oracle.com>
 <1454341137-14110-3-git-send-email-boris.ostrovsky@oracle.com>
 <20160203185525.GV20964@wotan.suse.de> <56B25F0C.2050808@oracle.com>
 <20160203234026.GS20964@wotan.suse.de>
Cc: "Luis R. Rodriguez" <mcgrof@kernel.org>,
        David Vrabel <david.vrabel@citrix.com>, konrad.wilk@oracle.com,
        xen-devel@lists.xenproject.org, linux-kernel@vger.kernel.org,
        roger.pau@citrix.com, x86@kernel.org, GLin@suse.coma,
        bblanco@plumgrid.com, pmonclus@plumgrid.com, bp@suse.de, hpa@zytor.com
From: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Message-ID: <56B3AC67.7080704@oracle.com>
Date: Thu, 4 Feb 2016 14:54:15 -0500
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101
 Thunderbird/38.1.0
MIME-Version: 1.0
In-Reply-To: <20160203234026.GS20964@wotan.suse.de>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-Source-IP: aserv0021.oracle.com [141.146.126.233]
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 02/03/2016 06:40 PM, Luis R. Rodriguez wrote:
> On Wed, Feb 03, 2016 at 03:11:56PM -0500, Boris Ostrovsky wrote:
>> On 02/03/2016 01:55 PM, Luis R. Rodriguez wrote:
>>> I saw no considerations for the recommendations I had made last on your v1:
>>>
>>> https://lkml.kernel.org/r/CAB=NE6XPA0YzbnM8=rspkKai6d3GkXXO00Gr0VZUYoyzNy6thw@mail.gmail.com
>>>
>>> Of importance:
>>>
>>> 1) Using pv_info.paravirt_enabled = 1 is wrong unless you mean to say this
>>>     is for legacy x86:
>>>
>>> Your patch #3 keeps on setting pv_info.paravirt_enabled = 1 and as discussed
>>> this is wrong. It will be renamed to x86_legacy_free() to align with what folks
>>> are pushing for a BIOS flag to annotate if a system requires legacy x86 stuff.
>>> This also means re-thinking all use cases and ensuring subarch is used
>>> instead when the goal was to keep Xen from entering that code. Today Xen does
>>> not use this but with my work it does and it helps clean and brush up a lot of
>>> these checks with future prospects to even help unify entry points.
>> As I said earlier, I am not sure I understand what subarch buys us
>> for HVMlite guests.
> I accepted subarch may not be the right thing, so proposed a hypervisor type.

I don't see much difference between having an HV-specific subarch and a 
hypervisor type.

> What it buys you is a strong semantics association between code designed
> for a purpose.
>
>> As for using paravirt_enabled -- this is really only used to
>> differentiate HVM from HVMlite and I think (although I'd need to
>> check) is only needed by Xen-specific code in a couple of places.
> That sounds like a Xen-specific use case; as such, an interface that has
> been pointed out as going to be renamed to reflect its actual use case
> should not be abused for that purpose.
>
>> So if/when it is removed we will switch to something else. Since your work is
>> WIP I decided to keep using it until it's clear what other options may be
>> available.
> And your work is not WIP? I'll be splitting my patches up and the rename
> will be atomic; it can likely go in before yours, so I'm not sure why you
> are simply brushing this off.

I didn't mean to imply anything by saying that your patches are a WIP. 
It's just that I can only write and test my patches against existing 
code, not future code.

I am sorry if you felt I was trying to say something else; that was 
certainly not my intent.

>
>>> 2) We should avoid more hypervisor type hacks, and just consider a new
>>>     hypervisor type to close the gap:
>>>
>>> Using x86_legacy_free() and friends in a unified way for all systems means it
>>> should only be used after init_hypervisor_platform() which is called during
>>> setup_arch().  This means we have a semantic gap for checks such as "are we
>>> on a hypervisor, and which one?".
>> In this particular case we don't need any information about
>> hypervisor until init_hypervisor_platform().
> I pointed out in your v1 patchset how microcode loading was not blocked; you
> then asked how KVM does it, and that was explained as well: they don't
> enable it either. You need a solution for this.

Not really. Xen will ignore writes to microcode-specific MSRs, just like 
KVM.

This is the exact same behavior we have now with regular HVM guests.
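
To illustrate what "ignore" means here, a toy model follows. This is only a
sketch, not Xen's or KVM's actual code: struct toy_vcpu and toy_wrmsr() are
made up for this example, and only the two MSR numbers are real. The point is
that the write is accepted without a fault and no guest-visible state changes.

```c
#include <assert.h>
#include <stdint.h>

/* Real MSR numbers for the Intel and AMD microcode update triggers. */
#define MSR_IA32_UCODE_WRITE   0x00000079u
#define MSR_AMD64_PATCH_LOADER 0xc0010020u

/* Hypothetical guest-visible state, for illustration only. */
struct toy_vcpu {
	uint64_t ucode_rev;	/* microcode revision the guest can read back */
	uint64_t other_msr;	/* stand-in for any ordinary, writable MSR */
};

/*
 * Toy WRMSR handler. Returns 0 when the write is accepted, i.e. no fault
 * would be injected into the guest.
 */
int toy_wrmsr(struct toy_vcpu *v, uint32_t msr, uint64_t val)
{
	assert(v);

	switch (msr) {
	case MSR_IA32_UCODE_WRITE:
	case MSR_AMD64_PATCH_LOADER:
		/* Silently drop the write: no fault, no state change. */
		return 0;
	default:
		/* Ordinary MSRs in this toy model simply store the value. */
		v->other_msr = val;
		return 0;
	}
}
```

With this model the guest's microcode loader runs to completion but the
revision it reads back afterwards is unchanged.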


> As-is the x86 boot protocol would not allow an easy way for this, I'm 
> suggesting we consider extending the boot protocol to add a hypervisor 
> type and data pointer much as with subarch and subarch_data for the

Who will set hypervisor type and where? It won't be Xen as Andrew 
mentioned in another email.

> particular purpose of both enabling entry into the same startup_32()
> but also a clean way for modifications of stubs both at the beginning
> and at the end of startup_32().
>
> Pseudo code:
>
> startup_32()                         startup_64()
>         |                                  |
>         |                                  |
>         V                                  V
> pre_hypervisor_stub_32()	pre_hypervisor_stub_64()
>         |                                  |
>         |                                  |
>         V                                  V
>   [existing startup_32()]       [existing startup_64()]
>         |                                  |
>         |                                  |
>         V                                  V
> post_hypervisor_stub_32()	post_hypervisor_stub_64()
>
> The pre_hypervisor_stub_32() would have much of the code in
> hvmlite_start_xen() but for 32-bit; pre_hypervisor_stub_64()
> would have the 64-bit equivalent.


Sure. When the protocol is agreed upon and this code is written we will 
just move hvmlite_start_xen() to pre_hypervisor_stub_32().
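
For the sake of discussion, the flow in the diagram could be modelled in C
roughly as below. This is a sketch under assumptions: nothing like it exists
in the boot protocol today, the real code would be early assembly rather than
C, and every name here (the enum, the stubs table, startup_example()) is
hypothetical. It only shows the ordering: pre stub, existing startup code,
post stub, selected by a hypervisor type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hypervisor type, as it might be passed by a boot protocol. */
enum hv_type_example {
	HV_EXAMPLE_NONE = 0,		/* bare metal: no stubs run */
	HV_EXAMPLE_XEN_HVMLITE,
	HV_EXAMPLE_MAX
};

struct hv_stubs_example {
	void (*pre_stub)(void *hv_data);	/* runs before common startup */
	void (*post_stub)(void *hv_data);	/* runs after common startup */
};

/* Records the order in which the steps ran, purely for demonstration. */
#define BOOT_TRACE_MAX 3
int boot_trace[BOOT_TRACE_MAX];
int boot_trace_len;

static void record(int step)
{
	assert(boot_trace_len < BOOT_TRACE_MAX);
	boot_trace[boot_trace_len++] = step;
}

static void hvmlite_pre_stub(void *hv_data)  { (void)hv_data; record(1); }
static void existing_startup(void)           { record(2); }
static void hvmlite_post_stub(void *hv_data) { (void)hv_data; record(3); }

/* Entries not listed stay zeroed, so their stub pointers are NULL. */
static const struct hv_stubs_example stubs[HV_EXAMPLE_MAX] = {
	[HV_EXAMPLE_XEN_HVMLITE] = { hvmlite_pre_stub, hvmlite_post_stub },
};

void startup_example(enum hv_type_example type, void *hv_data)
{
	assert(type < HV_EXAMPLE_MAX);
	const struct hv_stubs_example *s = &stubs[type];

	if (s->pre_stub)
		s->pre_stub(hv_data);
	existing_startup();
	if (s->post_stub)
		s->post_stub(hv_data);
}
```

In this model the code now in hvmlite_start_xen() would live in the pre stub,
and a bare-metal boot would run only the existing startup code.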


> +int xen_hvmlite __attribute__((section(".data"))) = 0;
> +struct hvm_start_info hvmlite_start_info __attribute__((section(".data")));
> +uint hvmlite_start_info_sz = sizeof(hvmlite_start_info);
> +struct boot_params xen_hvmlite_boot_params __attribute__((section(".data")));
> +#endif
> +
>>> The section annotations seem like a very special use case, but they are likely
>>> worth documenting and defining a new macro for in include/linux/compiler.h.
>>> This would make it easier to change the section used here later, should we
>>> want to, and enable others to easily look up the reason for these
>>> annotations in a single place.
>> I wonder whether __initdata would be a good attribute. We only need
>> this early in the boot.
> I could not find other users of .data other than some specific driver.
> Using anything with *init* implies you can free the data later, but if we
> want to keep it I suggest a different prefix, up to you.

That's why I said that we only need this info early in the boot.
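
If we do go with a dedicated macro instead, a minimal sketch could look like
the following. The macro name __early_data is hypothetical (no such macro
exists in include/linux/compiler.h), and the struct is a two-field placeholder
rather than the real hvm_start_info layout. The idea is just that the section
choice and its rationale end up documented in one place.

```c
#include <assert.h>

/*
 * Hypothetical annotation: force a variable into .data even when it is
 * zero-initialized. The reason for the placement would be documented
 * here once instead of at every user.
 */
#define __early_data __attribute__((section(".data")))

/* Placeholder layout, not the real struct hvm_start_info. */
struct hvm_start_info_example {
	unsigned int magic;
	unsigned int flags;
};

int xen_hvmlite_example __early_data = 0;
struct hvm_start_info_example hvmlite_start_info_example __early_data;

/* Save the structure handed over at entry and mark the guest type. */
void hvmlite_example_save(const struct hvm_start_info_example *si)
{
	assert(si);	/* caller must pass the structure it was given */
	xen_hvmlite_example = 1;
	hvmlite_start_info_example = *si;
}
```

Switching to __initdata or any other section later would then be a one-line
change to the macro.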

-boris