From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.0 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 83746C04AB5 for ; Thu, 6 Jun 2019 05:18:39 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 47B2020717 for ; Thu, 6 Jun 2019 05:18:39 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 47B2020717 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.vnet.ibm.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([127.0.0.1]:54715 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hYknO-0002K0-GL for qemu-devel@archiver.kernel.org; Thu, 06 Jun 2019 01:18:38 -0400 Received: from eggs.gnu.org ([209.51.188.92]:46867) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hYkmU-0001wn-Lg for qemu-devel@nongnu.org; Thu, 06 Jun 2019 01:17:44 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hYkmS-0006lX-Ua for qemu-devel@nongnu.org; Thu, 06 Jun 2019 01:17:42 -0400 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:35504) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1hYkmS-0006kU-IY for qemu-devel@nongnu.org; Thu, 06 Jun 2019 01:17:40 -0400 Received: from pps.filterd (m0098404.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.27/8.16.0.27) with SMTP id x565HFlm137660 for ; Thu, 6 Jun 2019 01:17:39 -0400 Received: from e16.ny.us.ibm.com (e16.ny.us.ibm.com [129.33.205.206]) by mx0a-001b2d01.pphosted.com with ESMTP id 2sxr5xjj0g-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Thu, 06 Jun 2019 01:17:38 -0400 Received: from localhost by e16.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Thu, 6 Jun 2019 06:17:37 +0100 Received: from b01cxnp22036.gho.pok.ibm.com (9.57.198.26) by e16.ny.us.ibm.com (146.89.104.203) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256) Thu, 6 Jun 2019 06:17:34 +0100 Received: from b01ledav002.gho.pok.ibm.com (b01ledav002.gho.pok.ibm.com [9.57.199.107]) by b01cxnp22036.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id x565HXBZ16974094 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 6 Jun 2019 05:17:33 GMT Received: from b01ledav002.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 3018E124066; Thu, 6 Jun 2019 05:17:33 +0000 (GMT) Received: from b01ledav002.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 7E0A3124053; Thu, 6 Jun 2019 05:17:26 +0000 (GMT) Received: from [9.85.82.46] (unknown [9.85.82.46]) by b01ledav002.gho.pok.ibm.com (Postfix) with ESMTP; Thu, 6 Jun 2019 05:17:25 +0000 (GMT) To: Greg Kurz References: <155910829070.13149.5215948335633966328.stgit@aravinda> <155910841478.13149.2830700794862210739.stgit@aravinda> <20190603121243.687d8d75@bahia.lab.toulouse-stg.fr.ibm.com> <20190603131723.261eba99@bahia.lab.toulouse-stg.fr.ibm.com> <4689dbb8-1ba1-2def-f9a7-b2f5a35bad18@linux.vnet.ibm.com> <20190604165026.487ca347@bahia.lab.toulouse-stg.fr.ibm.com> From: Aravinda Prasad Date: Thu, 6 Jun 2019 10:47:23 +0530 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 MIME-Version: 1.0 In-Reply-To: <20190604165026.487ca347@bahia.lab.toulouse-stg.fr.ibm.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-TM-AS-GCONF: 00 x-cbid: 19060605-0072-0000-0000-000004382DE0 X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00011222; HX=3.00000242; KW=3.00000007; PH=3.00000004; SC=3.00000286; SDB=6.01213903; UDB=6.00638056; IPR=6.00994983; MB=3.00027202; MTD=3.00000008; XFM=3.00000015; UTC=2019-06-06 05:17:36 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 19060605-0073-0000-0000-00004C84AD2A Message-Id: <49938b06-4463-150d-1f7f-2b77bbf16d03@linux.vnet.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, , definitions=2019-06-06_04:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1906060039 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x [generic] X-Received-From: 148.163.156.1 Subject: Re: [Qemu-devel] [Qemu-ppc] [PATCH v9 1/6] ppc: spapr: Handle "ibm, nmi-register" and "ibm, nmi-interlock" RTAS calls X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: aik@au1.ibm.com, qemu-devel@nongnu.org, paulus@ozlabs.org, qemu-ppc@nongnu.org, david@gibson.dropbear.id.au Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" On Tuesday 04 June 2019 08:20 PM, Greg Kurz wrote: > On Tue, 4 Jun 2019 11:38:31 +0530 > Aravinda Prasad wrote: > >> On Monday 03 June 2019 04:47 PM, Greg Kurz wrote: >>> On Mon, 3 Jun 2019 12:12:43 +0200 >>> Greg Kurz wrote: >>> >>>> On Wed, 29 May 2019 11:10:14 +0530 >>>> Aravinda Prasad wrote: >>>> >>>>> This patch adds support in QEMU to handle "ibm,nmi-register" >>>>> and "ibm,nmi-interlock" RTAS calls. >>>>> >>>>> The machine check notification address is saved when the >>>>> OS issues "ibm,nmi-register" RTAS call. >>>>> >>>>> This patch also handles the case when multiple processors >>>>> experience machine check at or about the same time by >>>>> handling "ibm,nmi-interlock" call. In such cases, as per >>>>> PAPR, subsequent processors serialize waiting for the first >>>>> processor to issue the "ibm,nmi-interlock" call. The second >>>>> processor that also received a machine check error waits >>>>> till the first processor is done reading the error log. >>>>> The first processor issues "ibm,nmi-interlock" call >>>>> when the error log is consumed. This patch implements the >>>>> releasing part of the error-log while subsequent patch >>>>> (which builds error log) handles the locking part. >>>>> >>>>> Signed-off-by: Aravinda Prasad >>>>> Reviewed-by: David Gibson >>>>> --- >>>> >>>> The code looks okay but it still seems wrong to advertise the RTAS >>>> calls to the guest that early in the series. The linux kernel in >>>> the guest will assume FWNMI is functional, which isn't true until >>>> patch 6 (yes, migration is part of the feature, it should be >>>> supported upfront, not fixed afterwards). >>>> >>>> It doesn't help much to introduce the RTAS calls early and to >>>> modify them in the other patches. I'd rather see the rest of >>>> the code first and a final patch that introduces the fully >>>> functional RTAS calls and calls spapr_rtas_register(). >>>> >>> >>> Thinking again, you should introduce the "fwnmi-mce" spapr capability in >>> its own patch first, default to "off" and and have the last patch in the >>> series to switch the default to "on" for newer machine types only. >>> >>> This patch should then only register the RTAS calls if "fwnmi-mcr" is set >>> to "on". >>> >>> This should address the fact that we don't want to expose a partially >>> implemented FWNMI feature to the guest, and we don't want to support >>> FWNMI at all with older machine types for the sake of compatibility. >> >> When you say "expose a partially implemented FWNMI feature to the >> guest", do you mean while debugging/bisect we may end up with exposing >> the partially implemented FWNMI feature? Otherwise it is expected that > > Yes, we don't want to break someone else's bisect. ok. > >> QEMU runs with all the 6 patches. >> >> If that is the case, I will have the rtas nmi register functionality as >> the last patch in the series. This way we don't have to have spapr cap >> turned off first and later turned on. However, as mentioned earlier >> (when David raised the same concern), use of guest_machine_check_addr >> may look odd at other patches as it is set only during rtas nmi register. >> > > Why odd ? see below > >> Or else, as a workaround, I can return RTAS_OUT_NOT_SUPPORTED for rtas >> nmi register till the entire functionality is implemented and only in >> the last patch in the series I will return RTAS_OUT_SUCCESS. This will >> ensure that we have a logical connection between the patches and the >> partially implemented fwnmi is not exposed to the guest kernel. >> > > Not exactly true. FWNMI would be exposed to the guest in the device tree > and the guest kernel would _just_ fail to set the fwnmi_active global: > > if (0 == rtas_call(ibm_nmi_register, 2, 1, NULL, system_reset_addr, > machine_check_addr)) > fwnmi_active = 1; Sorry for the confusion. I thought the suggestion was to introduce rtas_ibm_nmi_register() call later in the series, but now I see that I can still have rtas_ibm_nmi_register(), but have a final patch that calls spapr_rtas_register(). > >> Regards, >> Aravinda >> >> >> >> >>> >>>>> hw/ppc/spapr.c | 7 +++++ >>>>> hw/ppc/spapr_rtas.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++++ >>>>> include/hw/ppc/spapr.h | 9 ++++++- >>>>> 3 files changed, 80 insertions(+), 1 deletion(-) >>>>> >>>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c >>>>> index e2b33e5..fae28a9 100644 >>>>> --- a/hw/ppc/spapr.c >>>>> +++ b/hw/ppc/spapr.c >>>>> @@ -1808,6 +1808,11 @@ static void spapr_machine_reset(void) >>>>> first_ppc_cpu->env.gpr[5] = 0; >>>>> >>>>> spapr->cas_reboot = false; >>>>> + >>>>> + spapr->guest_machine_check_addr = -1; >>>>> + >>>>> + /* Signal all vCPUs waiting on this condition */ >>>>> + qemu_cond_broadcast(&spapr->mc_delivery_cond); >>>>> } >>>>> >>>>> static void spapr_create_nvram(SpaprMachineState *spapr) >>>>> @@ -3072,6 +3077,8 @@ static void spapr_machine_init(MachineState *machine) >>>>> >>>>> kvmppc_spapr_enable_inkernel_multitce(); >>>>> } >>>>> + >>>>> + qemu_cond_init(&spapr->mc_delivery_cond); >>>>> } >>>>> >>>>> static int spapr_kvm_type(MachineState *machine, const char *vm_type) >>>>> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c >>>>> index 5bc1a93..e7509cf 100644 >>>>> --- a/hw/ppc/spapr_rtas.c >>>>> +++ b/hw/ppc/spapr_rtas.c >>>>> @@ -352,6 +352,38 @@ static void rtas_get_power_level(PowerPCCPU *cpu, SpaprMachineState *spapr, >>>>> rtas_st(rets, 1, 100); >>>>> } >>>>> >>>>> +static void rtas_ibm_nmi_register(PowerPCCPU *cpu, >>>>> + SpaprMachineState *spapr, >>>>> + uint32_t token, uint32_t nargs, >>>>> + target_ulong args, >>>>> + uint32_t nret, target_ulong rets) >>>>> +{ >>>>> + hwaddr rtas_addr = spapr_get_rtas_addr(); >>>>> + >>>>> + if (!rtas_addr) { >>>>> + rtas_st(rets, 0, RTAS_OUT_NOT_SUPPORTED); >>>>> + return; >>>>> + } >>>>> + >>>>> + spapr->guest_machine_check_addr = rtas_ld(args, 1); >>>>> + rtas_st(rets, 0, RTAS_OUT_SUCCESS); >>>>> +} >>>>> + >>>>> +static void rtas_ibm_nmi_interlock(PowerPCCPU *cpu, >>>>> + SpaprMachineState *spapr, >>>>> + uint32_t token, uint32_t nargs, >>>>> + target_ulong args, >>>>> + uint32_t nret, target_ulong rets) >>>>> +{ >>>>> + if (spapr->guest_machine_check_addr == -1) { >>>>> + /* NMI register not called */ >>>>> + rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR); >>>>> + } else { >>>>> + qemu_cond_signal(&spapr->mc_delivery_cond); >>>>> + rtas_st(rets, 0, RTAS_OUT_SUCCESS); >>>>> + } >>>>> +} >>>>> + >>>>> static struct rtas_call { >>>>> const char *name; >>>>> spapr_rtas_fn fn; >>>>> @@ -470,6 +502,35 @@ void spapr_load_rtas(SpaprMachineState *spapr, void *fdt, hwaddr addr) >>>>> } >>>>> } >>>>> >>>>> +hwaddr spapr_get_rtas_addr(void) >>>>> +{ >>>>> + SpaprMachineState *spapr = SPAPR_MACHINE(qdev_get_machine()); >>>>> + int rtas_node; >>>>> + const struct fdt_property *rtas_addr_prop; >>>>> + void *fdt = spapr->fdt_blob; >>>>> + uint32_t rtas_addr; >>>>> + >>>>> + /* fetch rtas addr from fdt */ >>>>> + rtas_node = fdt_path_offset(fdt, "/rtas"); >>>>> + if (rtas_node == 0) { >>>>> + return 0; >>>>> + } >>>>> + >>>>> + rtas_addr_prop = fdt_get_property(fdt, rtas_node, "linux,rtas-base", NULL); >>>>> + if (!rtas_addr_prop) { >>>>> + return 0; >>>>> + } >>>>> + >>>>> + /* >>>>> + * We assume that the OS called RTAS instantiate-rtas, but some other >>>>> + * OS might call RTAS instantiate-rtas-64 instead. This fine as of now >>>>> + * as SLOF only supports 32-bit variant. >>>>> + */ >>>>> + rtas_addr = fdt32_to_cpu(*(uint32_t *)rtas_addr_prop->data); >>>>> + return (hwaddr)rtas_addr; >>>>> +} >>>>> + >>>>> + >>>>> static void core_rtas_register_types(void) >>>>> { >>>>> spapr_rtas_register(RTAS_DISPLAY_CHARACTER, "display-character", >>>>> @@ -493,6 +554,10 @@ static void core_rtas_register_types(void) >>>>> rtas_set_power_level); >>>>> spapr_rtas_register(RTAS_GET_POWER_LEVEL, "get-power-level", >>>>> rtas_get_power_level); >>>>> + spapr_rtas_register(RTAS_IBM_NMI_REGISTER, "ibm,nmi-register", >>>>> + rtas_ibm_nmi_register); >>>>> + spapr_rtas_register(RTAS_IBM_NMI_INTERLOCK, "ibm,nmi-interlock", >>>>> + rtas_ibm_nmi_interlock); >>>>> } >>>>> >>>>> type_init(core_rtas_register_types) >>>>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h >>>>> index 4f5becf..9dc5e30 100644 >>>>> --- a/include/hw/ppc/spapr.h >>>>> +++ b/include/hw/ppc/spapr.h >>>>> @@ -188,6 +188,10 @@ struct SpaprMachineState { >>>>> * occurs during the unplug process. */ >>>>> QTAILQ_HEAD(, SpaprDimmState) pending_dimm_unplugs; >>>>> >>>>> + /* State related to "ibm,nmi-register" and "ibm,nmi-interlock" calls */ >>>>> + target_ulong guest_machine_check_addr; >>>>> + QemuCond mc_delivery_cond; >>>>> + >>>>> /*< public >*/ >>>>> char *kvm_type;This means it isn't related to XIVE it to set >>>>> char *host_model; >>>>> @@ -624,8 +628,10 @@ target_ulong spapr_hypercall(PowerPCCPU *cpu, target_ulong opcode, >>>>> #define RTAS_IBM_CREATE_PE_DMA_WINDOW (RTAS_TOKEN_BASE + 0x27) >>>>> #define RTAS_IBM_REMOVE_PE_DMA_WINDOW (RTAS_TOKEN_BASE + 0x28) >>>>> #define RTAS_IBM_RESET_PE_DMA_WINDOW (RTAS_TOKEN_BASE + 0x29) >>>>> +#define RTAS_IBM_NMI_REGISTER (RTAS_TOKEN_BASE + 0x2A) >>>>> +#define RTAS_IBM_NMI_INTERLOCK (RTAS_TOKEN_BASE + 0x2B) >>>>> >>>>> -#define RTAS_TOKEN_MAX (RTAS_TOKEN_BASE + 0x2A) >>>>> +#define RTAS_TOKEN_MAX (RTAS_TOKEN_BASE + 0x2C) >>>>> >>>>> /* RTAS ibm,get-system-parameter token values */ >>>>> #define RTAS_SYSPARM_SPLPAR_CHARACTERISTICS 20 >>>>> @@ -876,4 +882,5 @@ void spapr_check_pagesize(SpaprMachineState *spapr, hwaddr pagesize, >>>>> #define SPAPR_OV5_XIVE_BOTH 0x80 /* Only to advertise on the platform */ >>>>> >>>>> void spapr_set_all_lpcrs(target_ulong value, target_ulong mask); >>>>> +uint64_t spapr_get_rtas_addr(void); >>>>> #endif /* HW_SPAPR_H */ >>>>> >>>> >>>> >>> >> > -- Regards, Aravinda