From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.2 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS, USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 12801C2BA2B for ; Wed, 15 Apr 2020 06:51:18 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id D886A206D9 for ; Wed, 15 Apr 2020 06:51:17 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D886A206D9 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=kaod.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:44436 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jObtF-0001Ln-2u for qemu-devel@archiver.kernel.org; Wed, 15 Apr 2020 02:51:17 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:42517) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jObsN-0000Km-34 for qemu-devel@nongnu.org; Wed, 15 Apr 2020 02:50:24 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jObsL-0006zj-HO for qemu-devel@nongnu.org; Wed, 15 Apr 2020 02:50:23 -0400 Received: from 3.mo2.mail-out.ovh.net ([46.105.58.226]:50075) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1jObsL-0006vm-9O for qemu-devel@nongnu.org; Wed, 15 Apr 2020 02:50:21 -0400 Received: from player797.ha.ovh.net (unknown [10.110.171.5]) by mo2.mail-out.ovh.net (Postfix) with ESMTP id B212A1D3401 for ; Wed, 15 Apr 2020 08:50:11 +0200 (CEST) Received: from kaod.org (82-64-250-170.subs.proxad.net [82.64.250.170]) (Authenticated sender: clg@kaod.org) by player797.ha.ovh.net (Postfix) with ESMTPSA id 98C70D5CDE8E; Wed, 15 Apr 2020 06:49:58 +0000 (UTC) Subject: Re: [EXTERNAL] [PATCH] target/ppc: Fix mtmsr(d) L=1 variant that loses interrupts To: Nicholas Piggin , qemu-devel@nongnu.org References: <20200414111131.465560-1-npiggin@gmail.com> From: =?UTF-8?Q?C=c3=a9dric_Le_Goater?= Message-ID: <74e47708-fcc0-d3db-5f6b-2a513722fef9@kaod.org> Date: Wed, 15 Apr 2020 08:49:58 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.6.0 MIME-Version: 1.0 In-Reply-To: <20200414111131.465560-1-npiggin@gmail.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US X-Ovh-Tracer-Id: 12583901784752425818 X-VR-SPAMSTATE: OK X-VR-SPAMSCORE: -100 X-VR-SPAMCAUSE: gggruggvucftvghtrhhoucdtuddrgeduhedrfedvgdduuddtucetufdoteggodetrfdotffvucfrrhhofhhilhgvmecuqfggjfdpvefjgfevmfevgfenuceurghilhhouhhtmecuhedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmnecujfgurhepuffvfhfhkffffgggjggtgfesthekredttdefjeenucfhrhhomhepveorughrihgtpgfnvggpifhorghtvghruceotghlgheskhgrohgurdhorhhgqeenucffohhmrghinhepphhrohhofhhpohhinhhtrdgtohhmpdhmshhgtddtkedviedrhhhtmhhlpddvtdekfedtuddrhhhtmhhlpdhhrghnugihrdhtrghrghgvthenucfkpheptddrtddrtddrtddpkedvrdeigedrvdehtddrudejtdenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhhouggvpehsmhhtphdqohhuthdphhgvlhhopehplhgrhigvrhejleejrdhhrgdrohhvhhdrnhgvthdpihhnvghtpedtrddtrddtrddtpdhmrghilhhfrhhomheptghlgheskhgrohgurdhorhhgpdhrtghpthhtohepqhgvmhhuqdguvghvvghlsehnohhnghhnuhdrohhrgh Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 46.105.58.226 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-stable@nongnu.org, qemu-ppc@nongnu.org, Nathan Chancellor , Anton Blanchard , linuxppc-dev@lists.ozlabs.org, David Gibson Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" On 4/14/20 1:11 PM, Nicholas Piggin wrote: > If mtmsr L=3D1 sets MSR[EE] while there is a maskable exception pending= , > it does not cause an interrupt. This causes the test case to hang: >=20 > https://urldefense.proofpoint.com/v2/url?u=3Dhttps-3A__lists.gnu.org_ar= chive_html_qemu-2Dppc_2019-2D10_msg00826.html&d=3DDwIDAg&c=3Djf_iaSHvJObT= bx-siA1ZOg&r=3DXHJsZhhuWSw9713Fp0ciew&m=3DTQfi_v-8XYgz7MiMDAZ_CjkyalSh9-E= XhQ3oDesUm74&s=3DpFoesXbioVBh5wCuzEnzwgfze6X7e-a9unkfUgsRwiw&e=3D=20 >=20 > More recently, Linux reduced the occurance of operations (e.g., rfi) > which stop translation and allow pending interrupts to be processed. > This started causing hangs in Linux boot in long-running kernel tests, > running with '-d int' shows the decrementer stops firing despite DEC > wrapping and MSR[EE]=3D1. >=20 > https://urldefense.proofpoint.com/v2/url?u=3Dhttps-3A__lists.ozlabs.org= _pipermail_linuxppc-2Ddev_2020-2DApril_208301.html&d=3DDwIDAg&c=3Djf_iaSH= vJObTbx-siA1ZOg&r=3DXHJsZhhuWSw9713Fp0ciew&m=3DTQfi_v-8XYgz7MiMDAZ_Cjkyal= Sh9-EXhQ3oDesUm74&s=3DEhkRfxvQMomvneYweWDEIUktCkKykgIqEmdhA0PtiwU&e=3D=20 >=20 > The cause is the broken mtmsr L=3D1 behaviour, which is contrary to the > architecture. From Power ISA v3.0B, p.977, Move To Machine State Regist= er, > Programming Note states: >=20 > If MSR[EE]=3D0 and an External, Decrementer, or Performance Monitor > exception is pending, executing an mtmsrd instruction that sets > MSR[EE] to 1 will cause the interrupt to occur before the next > instruction is executed, if no higher priority exception exists >=20 > Fix this by handling L=3D1 exactly the same way as L=3D0, modulo the MS= R > bits altered. >=20 > The confusion arises from L=3D0 being "context synchronizing" whereas L= =3D1 > is "execution synchronizing", which is a weaker semantic. However this > is not a relaxation of the requirement that these exceptions cause > interrupts when MSR[EE]=3D1 (e.g., when mtmsr executes to completion as > TCG is doing here), rather it specifies how a pipelined processor can > have multiple instructions in flight where one may influence how anothe= r > behaves. I was expecting more changes but this looks fine.=20 Reviewed-by: C=C3=A9dric Le Goater > Cc: qemu-stable@nongnu.org > Reported-by: Anton Blanchard > Reported-by: Nathan Chancellor > Tested-by: Nathan Chancellor > Signed-off-by: Nicholas Piggin I gave it a try on PowerNV, pseries and mac99. All good. Tested-by: C=C3=A9dric Le Goater I don't know how we could include tests in QEMU such as the one Anton=20 sent. These are good exercisers for our exception model. Thanks, C.=20 > --- > Thanks very much to Nathan for reporting and testing it, I added his > Tested-by tag despite a more polished patch, as the the basics are=20 > still the same (and still fixes his test case here). >=20 > This bug possibly goes back to early v2.04 / mtmsrd L=3D1 support aroun= d > 2007, and the code has been changed several times since then so may > require some backporting. >=20 > 32-bit / mtmsr untested at the moment, I don't have an environment > handy. > >=20 > target/ppc/translate.c | 46 +++++++++++++++++++++++++----------------- > 1 file changed, 27 insertions(+), 19 deletions(-) >=20 > diff --git a/target/ppc/translate.c b/target/ppc/translate.c > index b207fb5386..9959259dba 100644 > --- a/target/ppc/translate.c > +++ b/target/ppc/translate.c > @@ -4361,30 +4361,34 @@ static void gen_mtmsrd(DisasContext *ctx) > CHK_SV; > =20 > #if !defined(CONFIG_USER_ONLY) > + if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) { > + gen_io_start(); > + } > if (ctx->opcode & 0x00010000) { > - /* Special form that does not need any synchronisation */ > + /* L=3D1 form only updates EE and RI */ > TCGv t0 =3D tcg_temp_new(); > + TCGv t1 =3D tcg_temp_new(); > tcg_gen_andi_tl(t0, cpu_gpr[rS(ctx->opcode)], > (1 << MSR_RI) | (1 << MSR_EE)); > - tcg_gen_andi_tl(cpu_msr, cpu_msr, > + tcg_gen_andi_tl(t1, cpu_msr, > ~(target_ulong)((1 << MSR_RI) | (1 << MSR_EE))= ); > - tcg_gen_or_tl(cpu_msr, cpu_msr, t0); > + tcg_gen_or_tl(t1, t1, t0); > + > + gen_helper_store_msr(cpu_env, t1); > tcg_temp_free(t0); > + tcg_temp_free(t1); > + > } else { > /* > * XXX: we need to update nip before the store if we enter > * power saving mode, we will exit the loop directly from > * ppc_store_msr > */ > - if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) { > - gen_io_start(); > - } > gen_update_nip(ctx, ctx->base.pc_next); > gen_helper_store_msr(cpu_env, cpu_gpr[rS(ctx->opcode)]); > - /* Must stop the translation as machine state (may have) chang= ed */ > - /* Note that mtmsr is not always defined as context-synchroniz= ing */ > - gen_stop_exception(ctx); > } > + /* Must stop the translation as machine state (may have) changed *= / > + gen_stop_exception(ctx); > #endif /* !defined(CONFIG_USER_ONLY) */ > } > #endif /* defined(TARGET_PPC64) */ > @@ -4394,15 +4398,23 @@ static void gen_mtmsr(DisasContext *ctx) > CHK_SV; > =20 > #if !defined(CONFIG_USER_ONLY) > - if (ctx->opcode & 0x00010000) { > - /* Special form that does not need any synchronisation */ > + if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) { > + gen_io_start(); > + } > + if (ctx->opcode & 0x00010000) { > + /* L=3D1 form only updates EE and RI */ > TCGv t0 =3D tcg_temp_new(); > + TCGv t1 =3D tcg_temp_new(); > tcg_gen_andi_tl(t0, cpu_gpr[rS(ctx->opcode)], > (1 << MSR_RI) | (1 << MSR_EE)); > - tcg_gen_andi_tl(cpu_msr, cpu_msr, > + tcg_gen_andi_tl(t1, cpu_msr, > ~(target_ulong)((1 << MSR_RI) | (1 << MSR_EE))= ); > - tcg_gen_or_tl(cpu_msr, cpu_msr, t0); > + tcg_gen_or_tl(t1, t1, t0); > + > + gen_helper_store_msr(cpu_env, t1); > tcg_temp_free(t0); > + tcg_temp_free(t1); > + > } else { > TCGv msr =3D tcg_temp_new(); > =20 > @@ -4411,9 +4423,6 @@ static void gen_mtmsr(DisasContext *ctx) > * power saving mode, we will exit the loop directly from > * ppc_store_msr > */ > - if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) { > - gen_io_start(); > - } > gen_update_nip(ctx, ctx->base.pc_next); > #if defined(TARGET_PPC64) > tcg_gen_deposit_tl(msr, cpu_msr, cpu_gpr[rS(ctx->opcode)], 0, = 32); > @@ -4422,10 +4431,9 @@ static void gen_mtmsr(DisasContext *ctx) > #endif > gen_helper_store_msr(cpu_env, msr); > tcg_temp_free(msr); > - /* Must stop the translation as machine state (may have) chang= ed */ > - /* Note that mtmsr is not always defined as context-synchroniz= ing */ > - gen_stop_exception(ctx); > } > + /* Must stop the translation as machine state (may have) changed *= / > + gen_stop_exception(ctx); > #endif > } > =20 >=20