Date: Tue, 26 Jul 2022 10:51:21 -0700
From: Oliver Upton
To: Alexandru Elisei
Cc: Will Deacon, maz@kernel.org, kvmarm@lists.cs.columbia.edu,
    linux-arm-kernel@lists.infradead.org
Subject: Re: KVM/arm64: SPE: Translate VA to IPA on a stage 2 fault instead of pinning VM memory
References: <20220419141012.GB6143@willie-the-truck>

Hi Alex,

On Mon, Jul 25, 2022 at 11:06:24AM +0100, Alexandru Elisei wrote:

[...]

> > A funkier approach might be to defer pinning of the buffer until the SPE is
> > enabled and avoid pinning all of VM memory that way, although I can't
> > immediately tell how flexible the architecture is in allowing you to cache
> > the base/limit values.
>
> I was investigating this approach, and Mark raised a concern that I think
> might be a showstopper.
>
> Let's consider this scenario:
>
> Initial conditions: guest at EL1, profiling disabled (PMBLIMITR_EL1.E = 0,
> PMBSR_EL1.S = 0, PMSCR_EL1.{E0SPE,E1SPE} = {0,0}).
>
> 1. Guest programs the buffer and enables it (PMBLIMITR_EL1.E = 1).
> 2. Guest programs SPE to enable profiling at **EL0**
>    (PMSCR_EL1.{E0SPE,E1SPE} = {1,0}).
> 3. Guest changes the translation table entries for the buffer. The
>    architecture allows this.
> 4. Guest does an ERET to EL0, thus enabling profiling.
>
> Since KVM cannot trap the ERET to EL0, it will be impossible for KVM to pin
> the buffer at stage 2 when profiling gets enabled at EL0.

Not saying we necessarily should, but this is possible with FGT, no?

> I can see two solutions here:
>
> a. Accept the limitation (and advertise it in the documentation) that if
> someone wants to use SPE when running as a Linux guest, the kernel used by
> the guest must not change the buffer translation table entries after the
> buffer has been enabled (PMBLIMITR_EL1.E = 1). Linux already does that, so
> running a Linux guest should not be a problem. I don't know how other OSes
> do it (but I can find out). We could also phrase it that the buffer
> translation table entries can be changed after enabling the buffer, but
> only if profiling happens at EL1. But that sounds very arbitrary.
>
> b. Pin the buffer after the stage 2 DABT that SPE will report in the
> situation above. This means that there is a blackout window, but will
> happen only once after each time the guest reprograms the buffer. I don't
> know if this is acceptable. We could say that if this blackout window
> is not acceptable, then the guest kernel shouldn't change the translation
> table entries after enabling the buffer.
>
> Or drop the approach of pinning the buffer and go back to pinning the
> entire memory of the VM.
>
> Any thoughts on this? I would very much prefer to try to pin only the
> buffer.

Doesn't pinning the buffer also imply pinning the stage 1 tables
responsible for its translation? I agree that pinning the buffer is
likely the best way forward as pinning the whole of guest memory is
entirely impractical.

I'm also a bit confused on how we would manage to un-pin memory on the
way out with this.
The guest is free to muck with the stage 1 and could cause the SPU to
spew a bunch of stage 2 aborts if it wanted to be annoying. One way to
tackle it would be to only allow a single root-to-target walk to be
pinned by a vCPU at a time. Any time a new stage 2 abort comes from the
SPU, we un-pin the old walk and pin the new one instead.

Live migration also throws a wrench in this. IOW, there are still
potential sources of blackout unattributable to guest manipulation of
the SPU.

Going to think on this some more..

--
Thanks,
Oliver
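
For illustration only, the "single root-to-target walk pinned per vCPU"
idea above could be sketched roughly as follows. This is a toy userspace
model, not KVM code: pin_count[], struct pinned_walk, struct vcpu_spe_pin,
spe_handle_s2_abort() and spe_unpin_all() are all made-up names, and the
walk is assumed to be handed in as a list of guest frame numbers (table
pages plus the buffer page). It pins the new walk before dropping the old
one, which is one possible ordering and an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_WALK_PAGES 5   /* hypothetical: up to 4 table levels + target page */
#define NR_GUEST_PAGES 64  /* hypothetical size of the toy guest memory */

/* Toy stand-in for per-page pin counts; not a real KVM data structure. */
static unsigned int pin_count[NR_GUEST_PAGES];

/* One root-to-target walk: the stage 1 table pages plus the buffer page. */
struct pinned_walk {
	size_t nr_pages;
	unsigned long gfn[MAX_WALK_PAGES];
};

/* Hypothetical per-vCPU state: at most one walk is pinned at any time. */
struct vcpu_spe_pin {
	struct pinned_walk cur;
};

/* Drop the pins held by a walk and mark it empty. */
static void unpin_walk(struct pinned_walk *w)
{
	for (size_t i = 0; i < w->nr_pages; i++) {
		assert(pin_count[w->gfn[i]] > 0);
		pin_count[w->gfn[i]]--;
	}
	w->nr_pages = 0;
}

/*
 * Called for each stage 2 abort reported by the SPU (in this model).
 * Replaces the currently pinned walk with the new one.
 * Returns 0 on success, -1 if the walk is malformed (state is unchanged).
 */
static int spe_handle_s2_abort(struct vcpu_spe_pin *v,
			       const unsigned long *gfns, size_t nr)
{
	if (nr == 0 || nr > MAX_WALK_PAGES)
		return -1;
	for (size_t i = 0; i < nr; i++)
		if (gfns[i] >= NR_GUEST_PAGES)
			return -1;

	/* Pin the new walk first so pages shared with the old walk stay pinned. */
	for (size_t i = 0; i < nr; i++)
		pin_count[gfns[i]]++;

	unpin_walk(&v->cur);

	memcpy(v->cur.gfn, gfns, nr * sizeof(gfns[0]));
	v->cur.nr_pages = nr;
	return 0;
}

/* "On the way out": with a single walk per vCPU there is one thing to drop. */
static void spe_unpin_all(struct vcpu_spe_pin *v)
{
	unpin_walk(&v->cur);
}
```

With this shape, a guest that keeps rewriting its stage 1 can hold at
most MAX_WALK_PAGES pages pinned per vCPU, and teardown only has to
call spe_unpin_all() once. It does nothing about the blackout window
between the abort and the re-pin, or about live migration.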