Date: Sat, 20 Apr 2024 20:08:36 +0100
Message-ID: <87wmorrgwb.wl-maz@kernel.org>
From: Marc Zyngier
To: Oliver Upton
Cc: kvmarm@lists.linux.dev, James Morse, Suzuki K Poulose, Zenghui Yu, Eric Auger
Subject: Re: [PATCH v2 09/19] KVM: arm64: vgic-its: Maintain a translation cache per ITS
In-Reply-To: <20240419223842.951452-10-oliver.upton@linux.dev>
References: <20240419223842.951452-1-oliver.upton@linux.dev>
 <20240419223842.951452-10-oliver.upton@linux.dev>

On Fri, 19 Apr 2024 23:38:32 +0100,
Oliver Upton wrote:
>
> Within the context of a single ITS, it is possible to use an xarray to
> cache the device ID & event ID translation to a particular irq
> descriptor. Take advantage of this to build a translation cache capable
> of fitting all valid translations for a given ITS.
>
> Signed-off-by: Oliver Upton
> ---
>  arch/arm64/kvm/vgic/vgic-its.c | 26 +++++++++++++++++++++++++-
>  include/kvm/arm_vgic.h         |  6 ++++++
>  2 files changed, 31 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
> index be17a53d16ef..fd576377c084 100644
> --- a/arch/arm64/kvm/vgic/vgic-its.c
> +++ b/arch/arm64/kvm/vgic/vgic-its.c
> @@ -530,6 +530,11 @@ static struct vgic_its *__vgic_doorbell_to_its(struct kvm *kvm, gpa_t db)
>  	return iodev->its;
>  }
>
> +static unsigned long vgic_its_cache_key(u32 devid, u32 eventid)
> +{
> +	return (((unsigned long)devid) << 32) | eventid;
> +}
> +

Although this is correct, you may want to use the fact that our
EIDs/DIDs are limited to 16 bits, as advertised in GITS_TYPER. By
having a more compact index, you'd get better performance from the
xarray itself (this is alluded to in the documentation).

>  static struct vgic_irq *__vgic_its_check_cache(struct vgic_dist *dist,
>  					       phys_addr_t db,
>  					       u32 devid, u32 eventid)
> @@ -583,6 +588,7 @@ static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
>  				       u32 devid, u32 eventid,
>  				       struct vgic_irq *irq)
>  {
> +	unsigned long cache_key = vgic_its_cache_key(devid, eventid);
>  	struct vgic_dist *dist = &kvm->arch.vgic;
>  	struct vgic_translation_cache_entry *cte;
>  	unsigned long flags;
> @@ -592,6 +598,9 @@ static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
>  	if (irq->hw)
>  		return;
>
> +	if (xa_reserve_irq(&its->translation_cache, cache_key, GFP_KERNEL_ACCOUNT))
> +		return;
> +
>  	raw_spin_lock_irqsave(&dist->lpi_list_lock, flags);
>
>  	if (unlikely(list_empty(&dist->lpi_translation_cache)))
> @@ -624,6 +633,11 @@ static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
>  	 */
>  	lockdep_assert_held(&its->its_lock);
>  	vgic_get_irq_kref(irq);
> +	/*
> +	 * Get a second ref for the ITS' translation cache. This will
> +	 * disappear.

Is that for the *global* translation cache?
Or this one? I guess there is one for each, but a clearer comment would
help, even if this gets removed three patches down the line.

> +	 */
> +	vgic_get_irq_kref(irq);
>
>  	cte->db = db;
>  	cte->devid = devid;
> @@ -633,6 +647,7 @@ static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
>  	/* Move the new translation to the head of the list */
>  	list_move(&cte->entry, &dist->lpi_translation_cache);
>
> +	xa_store(&its->translation_cache, cache_key, irq, 0);
>  out:
>  	raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
>  }
> @@ -641,7 +656,8 @@ static void vgic_its_invalidate_cache(struct vgic_its *its)
>  {
>  	struct vgic_dist *dist = &kvm->arch.vgic;
>  	struct vgic_translation_cache_entry *cte;
> -	unsigned long flags;
> +	unsigned long flags, idx;
> +	struct vgic_irq *irq;
>
>  	raw_spin_lock_irqsave(&dist->lpi_list_lock, flags);
>
> @@ -658,6 +674,11 @@ static void vgic_its_invalidate_cache(struct vgic_its *its)
>  	}
>
>  	raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
> +
> +	xa_for_each(&its->translation_cache, idx, irq) {
> +		xa_erase(&its->translation_cache, idx);
> +		vgic_put_irq(its->dev->kvm, irq);
> +	}
>  }
>
>  void vgic_its_invalidate_all_caches(struct kvm *kvm)
> @@ -1967,6 +1988,7 @@ static int vgic_its_create(struct kvm_device *dev, u32 type)
>
>  	INIT_LIST_HEAD(&its->device_list);
>  	INIT_LIST_HEAD(&its->collection_list);
> +	xa_init_flags(&its->translation_cache, XA_FLAGS_LOCK_IRQ);
>
>  	dev->kvm->arch.vgic.msis_require_devid = true;
>  	dev->kvm->arch.vgic.has_its = true;
> @@ -1997,6 +2019,8 @@ static void vgic_its_destroy(struct kvm_device *kvm_dev)
>
>  	vgic_its_free_device_list(kvm, its);
>  	vgic_its_free_collection_list(kvm, its);
> +	vgic_its_invalidate_cache(its);
> +	xa_destroy(&its->translation_cache);
>
>  	mutex_unlock(&its->its_lock);
>  	kfree(its);
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index ac7f15ec1586..0b027ed71dac 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -210,6 +210,12 @@ struct vgic_its {
>  	struct mutex its_lock;
>  	struct list_head device_list;
>  	struct list_head collection_list;
> +
> +	/*
> +	 * Caches the (device_id, event_id) -> vgic_irq translation for
> +	 * *deliverable* LPIs.

Can you clarify what "deliverable" LPIs are? That's not an
architectural term, and I'd rather you describe the conditions that
allow the LPI to be cached (my guess is: mapped and enabled).

> +	 */
> +	struct xarray translation_cache;
>  };
>
>  struct vgic_state_iter;

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
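
To illustrate the remark on vgic_its_cache_key() above, here is a minimal
userspace sketch, not kernel code and not part of the patch. It assumes
that device IDs and event IDs both fit in 16 bits, as the review says
GITS_TYPER advertises. The names wide_cache_key(), compact_cache_key(),
ID_BITS and ID_MAX are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: device IDs and event IDs are both limited to 16 bits. */
#define ID_BITS 16u
#define ID_MAX  ((1u << ID_BITS) - 1u)

/* Key layout used by the patch: device ID shifted into the upper 32 bits. */
static inline uint64_t wide_cache_key(uint32_t devid, uint32_t eventid)
{
	return ((uint64_t)devid << 32) | eventid;
}

/*
 * Hypothetical compact layout: device ID in bits [31:16], event ID in
 * bits [15:0]. Callers are expected to have validated both IDs against
 * the 16-bit limit already; the assert only documents that precondition.
 */
static inline uint32_t compact_cache_key(uint32_t devid, uint32_t eventid)
{
	assert(devid <= ID_MAX && eventid <= ID_MAX);
	return (devid << ID_BITS) | eventid;
}
```

With this layout, (devid 3, eventid 5) maps to index 0x30005 rather than
0x300000005, so every populated index stays below 2^32. Whether that
translates into a measurable gain depends on how the xarray ends up being
populated, so treat it as something to measure rather than a given.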