Date: Wed, 5 Apr 2023 07:48:32 +0000
From: Quentin Perret
To: Marc Zyngier
Cc: David Dai, Oliver Upton, "Rafael J. Wysocki", Viresh Kumar,
 Rob Herring, Krzysztof Kozlowski, Paolo Bonzini, Jonathan Corbet,
 James Morse, Suzuki K Poulose, Zenghui Yu, Catalin Marinas,
 Will Deacon, Mark Rutland, Lorenzo Pieralisi, Sudeep Holla,
 Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
 Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
 Daniel Bristot de Oliveira, Valentin Schneider,
 kernel-team@android.com, linux-pm@vger.kernel.org,
 devicetree@vger.kernel.org, linux-kernel@vger.kernel.org,
 kvm@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev
Subject: Re: [RFC PATCH 0/6] Improve VM DVFS and task placement behavior
References: <20230330224348.1006691-1-davidai@google.com>
 <86sfdfv0e1.wl-maz@kernel.org>
In-Reply-To: <86sfdfv0e1.wl-maz@kernel.org>

On Tuesday 04 Apr 2023 at 21:49:10 (+0100), Marc Zyngier wrote:
> On Tue, 04 Apr 2023 20:43:40 +0100,
> Oliver Upton wrote:
> >
> > Folks,
> >
> > On Thu, Mar 30, 2023 at 03:43:35PM -0700, David Dai wrote:
> > >
> > >
> > > PCMark
> > > Higher is better
> > > +-------------------+----------+------------+--------+-------+--------+
> > > | Test Case (score) | Baseline | Hypercall  | %delta | MMIO  | %delta |
> > > +-------------------+----------+------------+--------+-------+--------+
> > > | Weighted Total    | 6136     | 7274       | +19%   | 6867  | +12%   |
> > > +-------------------+----------+------------+--------+-------+--------+
> > > | Web Browsing      | 5558     | 6273       | +13%   | 6035  | +9%    |
> > > +-------------------+----------+------------+--------+-------+--------+
> > > | Video Editing     | 4921     | 5221       | +6%    | 5167  | +5%    |
> > > +-------------------+----------+------------+--------+-------+--------+
> > > | Writing           | 6864     | 8825       | +29%   | 8529  | +24%   |
> > > +-------------------+----------+------------+--------+-------+--------+
> > > | Photo Editing     | 7983     | 11593      | +45%   | 10812 | +35%   |
> > > +-------------------+----------+------------+--------+-------+--------+
> > > | Data Manipulation | 5814     | 6081       | +5%    | 5327  | -8%    |
> > > +-------------------+----------+------------+--------+-------+--------+
> > >
> > > PCMark Performance/mAh
> > > Higher is better
> > > +-----------+----------+-----------+--------+------+--------+
> > > |           | Baseline | Hypercall | %delta | MMIO | %delta |
> > > +-----------+----------+-----------+--------+------+--------+
> > > | Score/mAh | 79       | 88        | +11%   | 83   | +7%    |
> > > +-----------+----------+-----------+--------+------+--------+
> > >
> > > Roblox
> > > Higher is better
> > > +-----+----------+------------+--------+-------+--------+
> > > |     | Baseline | Hypercall  | %delta | MMIO  | %delta |
> > > +-----+----------+------------+--------+-------+--------+
> > > | FPS | 18.25    | 28.66      | +57%   | 24.06 | +32%   |
> > > +-----+----------+------------+--------+-------+--------+
> > >
> > > Roblox Frames/mAh
> > > Higher is better
> > > +------------+----------+------------+--------+--------+--------+
> > > |            | Baseline | Hypercall  | %delta | MMIO   | %delta |
> > > +------------+----------+------------+--------+--------+--------+
> > > | Frames/mAh | 91.25    | 114.64     | +26%   | 103.11 | +13%   |
> > > +------------+----------+------------+--------+--------+--------+
> > >
> > >
> > > Next steps:
> > > ===========
> > > We are continuing to look into communication mechanisms other than
> > > hypercalls that are just as/more efficient and avoid switching into the VMM
> > > userspace. Any inputs in this regard are greatly appreciated.
> >
> > We're highly unlikely to entertain such an interface in KVM.
> >
> > The entire feature is dependent on pinning vCPUs to physical cores, for which
> > userspace is in the driver's seat. That is a well established and documented
> > policy which can be seen in the way we handle heterogeneous systems and
> > vPMU.
> >
> > Additionally, this bloats the KVM PV ABI with highly VMM-dependent interfaces
> > that I would not expect to benefit the typical user of KVM.
> >
> > Based on the data above, it would appear that the userspace implementation is
> > in the same neighborhood as a KVM-based implementation, which only further
> > weakens the case for moving this into the kernel.
> >
> > I certainly can appreciate the motivation for the series, but this feature
> > should be in userspace as some form of a virtual device.
>
> +1 on all of the above.

And I concur with all the above as well. Putting this in the kernel is
not an obvious fit at all as that requires a number of assumptions about
the VMM.

As Oliver pointed out, the guest topology, and how it maps to the host
topology (vcpu pinning etc) is very much a VMM policy decision and will
be particularly important to handle guest frequency requests correctly.

In addition to that, the VMM's software architecture may have an impact.
Crosvm for example does device emulation in separate processes for
security reasons, so it is likely that adjusting the scheduling
parameters ('util_guest', uclamp, or else) only for the vCPU thread that
issues frequency requests will be sub-optimal for performance; we may
want to adjust those parameters for all the tasks that are on the
critical path.

And at an even higher level, assuming in the kernel a certain mapping of
vCPU threads to host threads feels kinda wrong; this too is a host
userspace policy decision, I believe.
Not that anybody in their right mind would want to do this, but I
_think_ it would technically be feasible to serialize the execution of
multiple vCPUs on the same host thread, at which point the util_guest
thingy becomes entirely bogus. (I obviously don't want to conflate this
use-case; it's just an example that shows the proposed abstraction in
the series is not a perfect fit for the KVM userspace delegation model.)

So +1 from me to making this a virtual device of some kind. And if the
extra cost of exiting all the way back to userspace is prohibitive (is
it btw?), then we can try to work on that. Maybe something a la vhost
can be done to optimize; I'll have a think.

> The one thing I'd like to understand that the comment seems to imply
> that there is a significant difference in overhead between a hypercall
> and an MMIO. In my experience, both are pretty similar in cost for a
> handling location (both in userspace or both in the kernel). MMIO
> handling is a tiny bit more expensive due to a guaranteed TLB miss
> followed by a walk of the in-kernel device ranges, but that's all. It
> should hardly register.
>
> And if you really want some super-low latency, low overhead
> signalling, maybe an exception is the wrong tool for the job. Shared
> memory communication could be more appropriate.

I presume some kind of signalling mechanism will be necessary to
synchronously update host scheduling parameters in response to guest
frequency requests, but if the volume of data requires it then a shared
buffer + doorbell type of approach should do.

Thinking about it, using SCMI over virtio would implement exactly that.
Linux-as-a-guest already supports it IIRC, so possibly the problem being
addressed in this series could be 'simply' solved using an SCMI backend
in the VMM...

Thanks,
Quentin