Date: Mon, 30 Oct 2023 12:36:09 +0000
Message-ID: <86msw01e4m.wl-maz@kernel.org>
From: Marc Zyngier
To: Jan Henrik Weinstock
Cc: oliver.upton@linux.dev, james.morse@arm.com, suzuki.poulose@arm.com,
    yuzenghui@huawei.com, catalin.marinas@arm.com, will@kernel.org,
    linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
    linux-kernel@vger.kernel.org, Lukas Jünger
Subject: Re: KVM exit to userspace on WFI
References: <87ttql5aq7.wl-maz@kernel.org> <86cyx250w9.wl-maz@kernel.org>

[please make an effort not to top-post]

On Fri, 27 Oct 2023 18:41:44 +0100,
Jan Henrik Weinstock wrote:
>
> Hi Marc,
>
> the basic idea behind this is to have a (single-threaded) execution loop,
> something like this:
>
> vcpu-thread: vcpu-run | process-io-devices | vcpu-run | process-io...
>                       ^
>                 WFX or timeout
>
> We switch to simulating IO devices whenever the vcpu is idle (wfi) or exceeds
> a certain budget of instructions (counted via pmu). Our fallback currently is
> to kick the vcpu out of its execution using a signal (via a timeout/alarm). But
> of course, if the cpu is stuck at a wfi, we are wasting a lot of time.
>
> I understand that the proposed behavior is not desirable for most use cases,
> which is why I suggest locking it behind a flag, e.g.
> KVM_ARCH_FLAG_WFX_EXIT_TO_USER.
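For readers following the thread, the single-threaded loop and its signal-based fallback described in the quoted message can be sketched roughly as below. This is only an illustration under stated assumptions, not the poster's code: `run_vcpu` is a hypothetical stand-in for a blocking `ioctl(vcpu_fd, KVM_RUN, 0)` call (which returns -1 with `errno` set to `EINTR` when an unmasked signal arrives), the budget is a wall-clock timer rather than a PMU instruction count, and `run_loop`, `idle_vcpu_stub`, and `count_io_stub` are names made up for this example.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* expose sigaction, setitimer, nanosleep */
#endif
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical stand-in for ioctl(vcpu_fd, KVM_RUN, 0): returns 0 on a
 * normal exit, or -1 with errno == EINTR when a signal interrupted it. */
typedef int (*vcpu_run_fn)(void *vcpu);
typedef void (*io_step_fn)(void *devices);

/* The handler does nothing; its only job is to interrupt the blocking call. */
static void kick_handler(int sig) { (void)sig; }

static int install_kick(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = kick_handler; /* sa_flags == 0, so no SA_RESTART */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGALRM, &sa, NULL);
}

/* Arm a one-shot SIGALRM after `usec` microseconds; 0 disarms the timer. */
static int arm_budget(long usec)
{
    struct itimerval it;
    memset(&it, 0, sizeof(it));
    it.it_value.tv_sec = usec / 1000000;
    it.it_value.tv_usec = usec % 1000000;
    return setitimer(ITIMER_REAL, &it, NULL);
}

/* vcpu-run | process-io | vcpu-run | ... for `rounds` iterations.
 * Returns how many rounds ended because of the kick, or -1 on error.
 * Note: a kick that fires before run_vcpu() blocks is simply lost here;
 * a real VMM would have to close that window. */
int run_loop(vcpu_run_fn run_vcpu, void *vcpu,
             io_step_fn process_io, void *devices,
             long budget_usec, int rounds)
{
    int kicked = 0;

    assert(run_vcpu != NULL && process_io != NULL);
    if (install_kick() != 0)
        return -1;

    for (int i = 0; i < rounds; i++) {
        if (arm_budget(budget_usec) != 0)
            return -1;
        int ret = run_vcpu(vcpu);
        int saved_errno = errno;
        arm_budget(0); /* disarm so the kick cannot hit the I/O phase */

        if (ret < 0 && saved_errno == EINTR)
            kicked++;          /* budget expired while the vCPU ran or idled */
        else if (ret < 0)
            return -1;         /* genuine error */
        process_io(devices);   /* simulate the I/O devices */
    }
    return kicked;
}

/* Stub modelling a guest parked in WFI: it blocks (for at most 2 s)
 * until the kick signal interrupts it. */
int idle_vcpu_stub(void *vcpu)
{
    struct timespec idle = { .tv_sec = 2, .tv_nsec = 0 };
    (void)vcpu;
    return nanosleep(&idle, NULL);
}

/* Stub I/O phase: counts how many times it was invoked. */
void count_io_stub(void *devices) { ++*(int *)devices; }
```

With the idle stub, every round consumes the full budget before the I/O phase runs, which is the wasted time the quoted message complains about when the guest is sitting in WFI.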
But how do you reconcile the fact that exposing this to userspace
breaks fundamental expectations that the guest has, such as getting
its timer interrupts and directly injected LPIs? Implementing WFI in
userspace breaks them.

What about the case where we don't trap WFx and let the *guest* wait
for an interrupt?

Honestly, what you are describing seems to be a use model that doesn't
fit KVM, which is a general-purpose hypervisor, but rather a simulation
environment. Yes, the primitives are the same, but the plumbing is
wildly different.

*If* that's the stuff you're looking at, then I'm afraid you'll have
to do it in a different way, because what you are suggesting is
fundamentally incompatible with the guarantees that KVM gives to the
guest and to userspace.

Because your KVM_ARCH_FLAG_WFX_EXIT_TO_USER is really a lie. It should
really be named something more along the lines of
KVM_ARCH_FLAG_WFX_EXIT_TO_USER_SOMETIME_AND_I_DONT_EVEN_KNOW_WHEN
(probably with additional clauses related to breaking things).

Overall, you are still asking for something that is not guaranteed at
the architecture level, even less in KVM, and I'm not going to add
support for something that can only work "sometime".

	M.

-- 
Without deviation from the norm, progress is not possible.