Date: Thu, 25 Jan 2024 18:07:23 +0000
Message-ID: <86y1cd70dg.wl-maz@kernel.org>
From: Marc Zyngier
To: Oliver Upton
Cc: kvmarm@lists.linux.dev, kvm@vger.kernel.org, James Morse, Suzuki K Poulose, Zenghui Yu, Raghavendra Rao Ananta, Jing Zhang
Subject: Re: [PATCH 12/15] KVM: arm64: vgic-its: Pick cache victim based on usage count
References: <20240124204909.105952-1-oliver.upton@linux.dev> <20240124204909.105952-13-oliver.upton@linux.dev> <861qa58yy0.wl-maz@kernel.org>

On Thu, 25 Jan 2024 15:34:31 +0000, Oliver Upton wrote:
>
> On Thu, Jan 25, 2024 at 10:55:19AM +0000, Marc Zyngier wrote:
> > On Wed, 24 Jan 2024 20:49:06 +0000, Oliver Upton wrote:
>
> [...]
>
> > > +static struct vgic_translation_cache_entry *vgic_its_cache_victim(struct vgic_dist *dist)
> > > +{
> > > +	struct vgic_translation_cache_entry *cte, *victim = NULL;
> > > +	u64 min, tmp;
> > > +
> > > +	/*
> > > +	 * Find the least used cache entry since the last cache miss, preferring
> > > +	 * older entries in the case of a tie. Note that usage accounting is
> > > +	 * deliberately non-atomic, so this is all best-effort.
> > > +	 */
> > > +	list_for_each_entry(cte, &dist->lpi_translation_cache, entry) {
> > > +		if (!cte->irq)
> > > +			return cte;
> > > +
> > > +		tmp = atomic64_xchg_relaxed(&cte->usage_count, 0);
> > > +		if (!victim || tmp <= min) {
> >
> > min is not initialised until after the first round. Not great. How
> > come the compiler doesn't spot this?
>
> min never gets read on the first iteration, since victim is known to be
> NULL. Happy to initialize it though to keep this more obviously
> sane.

Ah, gotcha! Completely missed that. Sorry about the noise.

>
> > > +			victim = cte;
> > > +			min = tmp;
> > > +		}
> > > +	}
> >
> > So this resets all the counters on each search for a new insertion?
> > Seems expensive, especially on large VMs (512 * 16 = up to 8K SWP
> > instructions in a tight loop, and I'm not even mentioning the fun
> > without LSE). I can at least think of a box that will throw its
> > interconnect out of the pram if tickled that way.
>
> Well, each cache eviction after we hit the cache limit. I wrote this up
> to have _something_ that allowed the rculist conversion to later come
> back to rework further, but that obviously didn't happen.
>
> > I'd rather the new cache entry inherits the max of the current set,
> > making it a lot cheaper. We can always detect the overflow and do a
> > full invalidation in that case (worst case -- better options exist).
>
> Yeah, I like your suggested approach. I'll probably build a bit on top
> of that.
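
For illustration only, a rough userspace sketch of the "inherit the max"
idea discussed above. It is not kernel code and not part of the posted
series. Everything in it is an assumption made for the example: the names
struct cache_entry, struct scan_result, cache_scan() and cache_insert() are
made up, the cache is a plain array instead of the list used by the patch,
higher array indices are assumed to hold older entries, and a counter
sitting at UINT64_MAX is taken as the overflow signal. The scan only reads
the counters, so no per-entry exchange is needed, and min is initialised up
front.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a translation cache entry (not the kernel struct). */
struct cache_entry {
	int in_use;           /* 0: slot holds no translation */
	uint64_t usage_count; /* bumped on each cache hit */
};

struct scan_result {
	size_t victim;        /* slot to reuse */
	uint64_t max_usage;   /* highest count among in-use slots */
};

/*
 * Single read-only pass over the cache: no counter is written here.
 * A free slot wins outright. Otherwise the lowest count is picked, and
 * "<=" lets later slots (assumed older) win ties. min starts at
 * UINT64_MAX, so the first comparison is well defined on its own.
 */
static struct scan_result cache_scan(const struct cache_entry *cache, size_t n)
{
	struct scan_result res = { 0, 0 };
	uint64_t min = UINT64_MAX;
	int found_free = 0;
	size_t i;

	assert(n > 0);

	for (i = 0; i < n; i++) {
		uint64_t count = cache[i].usage_count;

		if (!cache[i].in_use) {
			if (!found_free) {
				res.victim = i;
				found_free = 1;
			}
			continue;
		}
		if (count > res.max_usage)
			res.max_usage = count;
		if (!found_free && count <= min) {
			res.victim = i;
			min = count;
		}
	}
	return res;
}

/*
 * Install a new entry and return its slot. The newcomer inherits the
 * current maximum so that it is not the next victim straight away. A
 * saturated maximum is treated as overflow and handled by invalidating
 * the whole cache before installing the new entry.
 */
static size_t cache_insert(struct cache_entry *cache, size_t n)
{
	struct scan_result res = cache_scan(cache, n);
	size_t i;

	if (res.max_usage == UINT64_MAX) {
		for (i = 0; i < n; i++) {
			cache[i].in_use = 0;
			cache[i].usage_count = 0;
		}
		res.victim = 0;
		res.max_usage = 0;
	}

	cache[res.victim].in_use = 1;
	cache[res.victim].usage_count = res.max_usage;
	return res.victim;
}
```

With four in-use entries whose counts are 5, 2, 2 and 9, cache_scan()
selects index 2 (the later of the two tied entries) and cache_insert()
starts the new entry at 9. How the real code would detect overflow, and
whether a full invalidation is the right response, is left open in the
thread.
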
>
> > > +
> > > +	return victim;
> > > +}
> > > +
> > >  static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
> > >  				       u32 devid, u32 eventid,
> > >  				       struct vgic_irq *irq)
> > > @@ -645,9 +664,12 @@ static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
> > >  		goto out;
> > >
> > >  	if (dist->lpi_cache_count >= vgic_its_max_cache_size(kvm)) {
> > > -		/* Always reuse the last entry (LRU policy) */
> > > -		victim = list_last_entry(&dist->lpi_translation_cache,
> > > -				      typeof(*cte), entry);
> > > +		victim = vgic_its_cache_victim(dist);
> > > +		if (WARN_ON_ONCE(!victim)) {
> > > +			victim = new;
> > > +			goto out;
> > > +		}
> >
> > I don't understand how this could happen. It sort of explains the
> > oddity I was mentioning earlier, but I don't think we need this
> > complexity.
>
> The only way it could actually happen is if a bug were introduced where
> lpi_cache_count is somehow nonzero but the list is empty. But yeah, we
> can dump this and assume we find a victim, which ought to always be
> true.

Right, that was my impression as well, and I couldn't find a way to
fail it.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.