Date: Fri, 12 Aug 2022 16:02:37 +0100
Message-ID: <87h72hv71u.wl-maz@kernel.org>
From: Marc Zyngier
To: Peter Maydell, Vitaly Chikunov
Cc: qemu-arm@nongnu.org, "Dmitry V. Levin", kvmarm, QEMU Developers
Subject: Re: qemu-system-aarch64: Failed to retrieve host CPU features
References: <20220812021427.cwenhciuftgtaj64@altlinux.org> <20220812084529.ur5qcyws5qvoyvuc@altlinux.org>

Hi Peter,

On Fri, 12 Aug 2022 10:25:55 +0100, Peter Maydell wrote:
>
> I've added some more relevant mailing lists to the cc.
>
> On Fri, 12 Aug 2022 at 09:45, Vitaly Chikunov wrote:
> > On Fri, Aug 12, 2022 at 05:14:27AM +0300, Vitaly Chikunov wrote:
> > > I noticed that we starting to get many errors like this:
> > >
> > >   qemu-system-aarch64: Failed to retrieve host CPU features
> > >
> > > Where many is 1-2% per run, depends on host, host is Kunpeng-920, and
> > > Linux kernel is v5.15.59, but it started to appear months before that.
> > >
> > > strace shows in erroneous case:
> > >
> > >   1152244 ioctl(9, KVM_CREATE_VM, 0x30) = -1 EINTR (Interrupted system call)
> > >
> > > And I see in target/arm/kvm.c:kvm_arm_create_scratch_host_vcpu:
> > >
> > >     vmfd = ioctl(kvmfd, KVM_CREATE_VM, max_vm_pa_size);
> > >     if (vmfd < 0) {
> > >         goto err;
> > >     }
> > >
> > > Maybe it should restart ioctl on EINTR?
> > >
> > > I don't see EINTR documented in ioctl(2) nor in Linux'
> > > Documentation/virt/kvm/api.rst for KVM_CREATE_VM, but for KVM_RUN it
> > > says "an unmasked signal is pending".
> >
> > I am suggested that almost any blocking syscall could return EINTR, so I
> > checked the strace log and it does not show evidence of arriving a signal,
> > the log ends like this:
> >
> >   1152244 openat(AT_FDCWD, "/dev/kvm", O_RDWR|O_CLOEXEC) = 9
> >   1152244 ioctl(9, KVM_CHECK_EXTENSION, KVM_CAP_ARM_VM_IPA_SIZE) = 48
> >   1152244 ioctl(9, KVM_CREATE_VM, 0x30) = -1 EINTR (Interrupted system call)
> >   1152244 close(9) = 0
> >   1152244 newfstatat(2, "", {st_dev=makedev(0, 0xd), st_ino=57869925, st_mode=S_IFIFO|0600, st_nlink=1, st_uid=517, st_gid=517, st_blksize=4096, st_blocks=0, st_size=0, st_atime=1660268019 /* 2022-08-12T01:33:39.850436293+0000 */, st_atime_nsec=850436293, st_mtime=1660268019 /* 2022-08-12T01:33:39.850436293+0000 */, st_mtime_nsec=850436293, st_ctime=1660268019 /* 2022-08-12T01:33:39.850436293+0000 */, st_ctime_nsec=850436293}, AT_EMPTY_PATH) = 0
> >   1152244 write(2, "qemu-system-aarch64: Failed to r"..., 58) = 58
> >   1152244 exit_group(1) = ?
> >   1152245 <... clock_nanosleep resumed> ) = ?
> >   1152245 +++ exited with 1 +++
> >   1152244 +++ exited with 1 +++
>
> KVM folks: should we expect that KVM_CREATE_VM might fail EINTR
> and need retrying?

In general, yes. But for this particular one, this is pretty odd.

The only path I can so far see that would match this behaviour is if
mm_take_all_locks() (called from __mmu_notifier_register()) was
getting interrupted by a signal (I'm looking at a 5.19-ish kernel,
which may slightly differ from the 5.15 mentioned above).

But as Vitaly points out, it doesn't seem to be a signal delivered
here.

Vitaly: could you please share your exact test case (full qemu command
line), and instrument your kernel to see if mm_take_all_locks() is the
one failing?

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
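
For reference, the "restart ioctl on EINTR" idea raised in the thread can be sketched in C as below. This is only an illustration under stated assumptions, not the QEMU code and not a proposed patch: retry_on_eintr() and fake_create_vm() are hypothetical names, and the fake call stands in for ioctl(kvmfd, KVM_CREATE_VM, ...) so that the loop can be exercised without /dev/kvm.

```c
#include <assert.h>   /* used by the usage snippet below */
#include <errno.h>
#include <stddef.h>

/* Hypothetical helper (not from QEMU): a call that follows the usual
 * syscall convention of returning -1 and setting errno on failure. */
typedef int (*syscall_fn)(void *arg);

/* Call fn(arg) until it succeeds or fails with an error other than
 * EINTR. Any other error is returned to the caller unchanged. */
static int retry_on_eintr(syscall_fn fn, void *arg)
{
    int ret;

    do {
        ret = fn(arg);
    } while (ret == -1 && errno == EINTR);

    return ret;
}

/* Stand-in for ioctl(kvmfd, KVM_CREATE_VM, ...): fails with EINTR
 * while *remaining > 0, then returns a made-up descriptor value. */
static int fake_create_vm(void *arg)
{
    int *remaining = arg;

    if (*remaining > 0) {
        (*remaining)--;
        errno = EINTR;
        return -1;
    }
    return 42; /* arbitrary fake fd, for the demo only */
}
```

With two simulated interruptions, retry_on_eintr(fake_create_vm, &failures) returns the fake descriptor once the counter reaches zero. An unbounded loop like this assumes the interruption is transient; it does not explain why KVM_CREATE_VM returns EINTR when no signal delivery shows up in the strace log, which is the open question above.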