Date: Mon, 13 Mar 2023 11:43:01 +0000
Message-ID: <86o7owyj0a.wl-maz@kernel.org>
From: Marc Zyngier
To: Colton Lewis
Cc: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com, ricarkol@google.com, sveith@amazon.de, dwmw2@infradead.org
Subject: Re: [PATCH 15/16] KVM: arm64: selftests: Augment existing timer test to handle variable offsets
References: <87a60m9u3a.wl-maz@kernel.org>
X-Mailing-List: kvm@vger.kernel.org

On Fri, 10 Mar 2023 19:26:47 +0000,
Colton Lewis wrote:
>
> Marc Zyngier writes:
>
> >> mvbbq9:/data/coltonlewis/ecv/arm64-obj/kselftest/kvm#
> >> ./aarch64/arch_timer -O 0xffff
> >> ==== Test Assertion Failure ====
> >>   aarch64/arch_timer.c:239: false
> >>   pid=48094 tid=48095 errno=4 - Interrupted system call
> >>      1  0x4010fb: test_vcpu_run at arch_timer.c:239
> >>      2  0x42a5bf: start_thread at pthread_create.o:0
> >>      3  0x46845b: thread_start at clone.o:0
> >>   Failed guest assert: xcnt >= cval at aarch64/arch_timer.c:151
> >> values: 2500645901305, 2500645961845; 9939, vcpu 0; stage; 3; iter: 2
>
> > The fun part is that you can
> > see similar things without the series:
>
> > ==== Test Assertion Failure ====
> >   aarch64/arch_timer.c:239: false
> >   pid=647 tid=651 errno=4 - Interrupted system call
> >      1  0x00000000004026db: test_vcpu_run at arch_timer.c:239
> >      2  0x00007fffb13cedd7: ?? ??:0
> >      3  0x00007fffb1437e9b: ?? ??:0
> >   Failed guest assert: config_iter + 1 == irq_iter at
> > aarch64/arch_timer.c:188
> > values: 2, 3; 0, vcpu 3; stage; 4; iter: 3
>
> > That's on a vanilla kernel (6.2-rc4) on an M1 with the test run
> > without any argument in a loop. After a few iterations, it blows.

I finally got to the bottom of that one. This is yet another case of
the test making the assumption that spurious interrupts don't exist...

Here, the timer interrupt has been masked at the source, but the GIC
(or its emulation) can be slow to retire it. So we take it again,
spuriously, and account it as a true interrupt. None of the asserts in
the timer handler fire because they only check the *previous* state.

Eventually, the interrupt retires and we progress to the next
iteration. But in the meantime, we have incremented the irq counter by
the number of spurious events, and the test fails.

The obvious fix is to check for the timer state in the handler and
exit early if the timer interrupt is masked or the timer disabled.
With that, I don't see these failures anymore.

I've folded that into the patch that already deals with some spurious
events.

	M.

-- 
Without deviation from the norm, progress is not possible.
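For illustration, the early-exit check described above could look roughly like the sketch below. This is not the code from the folded patch: `struct timer_state`, `timer_irq_is_live()` and `handle_timer_irq()` are hypothetical names, and the control register is mocked as a plain field so the logic can be compiled and run outside a guest. It also assumes the handler masks the timer once it has accounted a genuine interrupt. Only the bit positions (ENABLE = bit 0, IMASK = bit 1, ISTATUS = bit 2 of CNT{P,V}_CTL_EL0) come from the architecture.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural bit positions in CNT{P,V}_CTL_EL0. */
#define CTL_ENABLE	(1u << 0)
#define CTL_IMASK	(1u << 1)
#define CTL_ISTATUS	(1u << 2)

/* Hypothetical stand-in for the per-vCPU state the test tracks. */
struct timer_state {
	uint64_t ctl;		/* mocked timer control register */
	uint32_t irq_iter;	/* interrupts accounted as genuine */
};

/* The timer can only be a legitimate source if enabled and not masked. */
static bool timer_irq_is_live(uint64_t ctl)
{
	return (ctl & CTL_ENABLE) && !(ctl & CTL_IMASK);
}

/*
 * Hypothetical handler: returns true if the interrupt was accounted,
 * false if it was treated as spurious and ignored.
 */
static bool handle_timer_irq(struct timer_state *t)
{
	/* Exit early: the GIC may still be retiring a stale interrupt. */
	if (!timer_irq_is_live(t->ctl))
		return false;

	/* Assumption: mask at the source once the interrupt is taken. */
	t->ctl |= CTL_IMASK;
	t->irq_iter++;
	return true;
}
```

With this shape, a second delivery of the same interrupt after the handler has masked the timer returns early, so the counter only advances once per genuine event.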