From: Leonardo Bras
To: Will Deacon
Cc: Leonardo Bras, Oliver Upton, Marc Zyngier, Joey Gouly,
    Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Fuad Tabba,
    Raghavendra Rao Ananta, linux-arm-kernel@lists.infradead.org,
    kvmarm@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 1/2] KVM: arm64: Introduce S2 walker SKIP return options
Date: Tue, 19 May 2026 15:35:19 +0100
References: <20260515195904.2466381-1-leo.bras@arm.com> <20260515195904.2466381-2-leo.bras@arm.com>

On Tue, May 19, 2026 at 02:15:41PM +0100, Will Deacon wrote:
> On Tue, May 19, 2026 at 01:56:48PM +0100, Leonardo Bras wrote:
> > On Tue, May 19, 2026 at 01:43:37PM +0100, Will Deacon wrote:
> > > > > I was wondering along similar lines, but maybe it would be useful just
> > > > > to pass a maximum level to the walker logic? That feels like the most
> > > > > general case without complicating the existing logic.
> > > >
> > > > This proposal seems simpler for me to understand, and indeed looks like a
> > > > better solution than what I have proposed, taking care of the
> > > > 'already split' case with better performance, as it don't even walk a
> > > > single level-3 entry.
> > > >
> > > > On the 'splitting' case, it also works flawlessly if the memory is given in
> > > > level-2 blocks.
> > > > There is only one case that I would like to address here:
> > > >
> > > > - Memory given in level-1 blocks (say 1GB)
> > > > - Walker flag says 'walk down to level-2 only'
> > > > - Split Walker on level-1 will break page down to (up to) level-3 entries.
> > > > - Walker will continue to be called on level-2 entries, even though it's
> > > >   not necessary.
> > >
> > > If you're only visiting leaves, why would it be called on the level-2
> > > table entries?
> > >
> >
> > Because once the leaf is turned into a table by the splitting walker, it
> > gets reloaded and walked. This is an excerpt of __kvm_pgtable_visit():
>
> Sorry, I was musing about the semantics after adding something to limit
> the maximum level. I don't dispute what the current code would do.
>
> > Example:
> > - Split this level-1 leave:
> >   - Walker creates the whole structure up to given level (currently 3)
> >   - Walker returns, gets reloaded, table detected, go down on that one
> >   - Level 2 entries walked (which is unnecessary)
> >
> > Please let me know if I am misunderstanding something.
>
> I just don't grok why this would happen if we limited the maximum level
> to '2' _and_ said we only wanted to visit the leaf entries. In that
> case, I wouldn't expect to descend into any of the L2 table entries
> (because that would imply going beyond level 2) and I wouldn't expect to
> be called for the table entries either (because we're only interested in
> leaves).

Agreed: if we specify that level-3 entries are skipped, the walk only goes
down to level-2 entries. But take the example above in detail:

- Split these level-1 leaves down to level-3 leaves (the regular case)
- INFO: kvm_pgtable_walk will call the walker:
  - only up to level-2 entries (skip level-3)
  - only on leaf entries
- Walk the first level-1 leaf, which calls the walker
  - the walker splits the level-1 leaf into level-3 leaves
  - the walker returns from that first level-1 leaf
  - the level-1 entry is reloaded, now as a table
  - the level-2 entries of that table are also walked (unnecessary)
    - on each of the level-2 table entries, level-3 entries are skipped

To avoid the unnecessary walk of the level-2 entries above, we would need
to specify 'skip level-2'. That could be an issue if we have a mix of
level-1 and level-2 leaves, as the level-2 leaves would then not be split.

That's why I suggest something like "skip recently created table" as a
flag as well, so we can guarantee that no newly created table gets walked
unnecessarily.

Please let me know if I am missing something important.

Thanks!
Leo
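
The walk described above can be illustrated with a small userspace toy
model. This is only a sketch under stated assumptions, not the kernel's
__kvm_pgtable_visit(): struct pte, split_leaf(), visit(), walk_mixed(),
the skip_new_tables flag and the FANOUT of 4 are all hypothetical names
and values made up for the illustration. It counts how many entries the
walk loads per level when one level-1 block sits next to a table of
level-2 blocks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define FANOUT     4	/* toy value; a 4K-granule table really has 512 entries */
#define LAST_LEVEL 3

/* One entry: table == NULL means leaf (block/page), else FANOUT children. */
struct pte {
	struct pte *table;
};

struct walk_stats {
	int loaded[LAST_LEVEL + 1];	/* entries examined, indexed by level */
	int callbacks;			/* leaf-callback invocations */
	int unsplit_l2;			/* level-2 leaves left after the walk */
};

/* Toy split callback: replace a leaf with tables down to LAST_LEVEL. */
static void split_leaf(struct pte *e, int level)
{
	if (level >= LAST_LEVEL)
		return;
	e->table = calloc(FANOUT, sizeof(*e->table));
	assert(e->table);
	for (int i = 0; i < FANOUT; i++)
		split_leaf(&e->table[i], level + 1);
}

static void free_entry(struct pte *e)
{
	if (!e->table)
		return;
	for (int i = 0; i < FANOUT; i++)
		free_entry(&e->table[i]);
	free(e->table);
	e->table = NULL;
}

/* Leaf-only walk that never loads entries below max_level. */
static void visit(struct pte *e, int level, int max_level,
		  bool skip_new_tables, struct walk_stats *st)
{
	bool was_leaf = !e->table;

	st->loaded[level]++;
	if (was_leaf) {
		st->callbacks++;
		split_leaf(e, level);
	}

	/* Reload the entry: the callback may have installed a table. */
	if (!e->table || level >= max_level)
		return;
	if (was_leaf && skip_new_tables)	/* hypothetical flag */
		return;
	for (int i = 0; i < FANOUT; i++)
		visit(&e->table[i], level + 1, max_level, skip_new_tables, st);
}

/* Walk a mix: l1[0] is a level-1 block, l1[1] a table of level-2 blocks. */
struct walk_stats walk_mixed(int max_level, bool skip_new_tables)
{
	struct pte l1[2] = { { NULL }, { NULL } };
	struct walk_stats st = { { 0 }, 0, 0 };

	l1[1].table = calloc(FANOUT, sizeof(struct pte));
	assert(l1[1].table);

	for (int i = 0; i < 2; i++)
		visit(&l1[i], 1, max_level, skip_new_tables, &st);

	/* Any level-2 entry that is still a leaf was never split. */
	for (int i = 0; i < FANOUT; i++)
		if (!l1[1].table[i].table)
			st.unsplit_l2++;

	for (int i = 0; i < 2; i++)
		free_entry(&l1[i]);
	return st;
}
```

In this model, walk_mixed(2, false) loads 8 level-2 entries, 4 of which
belong to the table the callback has just created. walk_mixed(2, true)
loads only the 4 pre-existing level-2 blocks and still splits them all.
walk_mixed(1, false) loads no level-2 entries at all, but leaves the 4
level-2 blocks unsplit.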