* [PATCH] KVM: arm/arm64: vgic-its: Add error handling in vgic_its_cache_translation
@ 2024-11-28 13:45 Keisuke Nishimura
2024-11-28 16:54 ` Marc Zyngier
2024-11-28 17:55 ` Oliver Upton
0 siblings, 2 replies; 8+ messages in thread
From: Keisuke Nishimura @ 2024-11-28 13:45 UTC (permalink / raw)
To: Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu
Cc: linux-arm-kernel, kvmarm, Keisuke Nishimura
The xa_store() may fail because there is no guarantee that the cache_key
index is already used in its->translation_cache. This fix (1) resolves
the kref inconsistency on failure and (2) returns the error code.
Fixes: 8201d1028caa ("KVM: arm64: vgic-its: Maintain a translation cache per ITS")
Signed-off-by: Keisuke Nishimura <keisuke.nishimura@inria.fr>
---
arch/arm64/kvm/vgic/vgic-its.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
index 198296933e7e..8f423857b7d2 100644
--- a/arch/arm64/kvm/vgic/vgic-its.c
+++ b/arch/arm64/kvm/vgic/vgic-its.c
@@ -555,7 +555,7 @@ static struct vgic_irq *vgic_its_check_cache(struct kvm *kvm, phys_addr_t db,
return irq;
}
-static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
+static int vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
u32 devid, u32 eventid,
struct vgic_irq *irq)
{
@@ -564,7 +564,11 @@ static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
/* Do not cache a directly injected interrupt */
if (irq->hw)
- return;
+ return 0;
+
+ old = xa_store(&its->translation_cache, cache_key, irq, GFP_KERNEL_ACCOUNT);
+ if (xa_is_err(old))
+ return xa_err(old);
/*
* The irq refcount is guaranteed to be nonzero while holding the
@@ -578,9 +582,10 @@ static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
* translation behind our back, ensure we don't leak a
* reference if that is the case.
*/
- old = xa_store(&its->translation_cache, cache_key, irq, GFP_KERNEL_ACCOUNT);
if (old)
vgic_put_irq(kvm, old);
+
+ return 0;
}
static void vgic_its_invalidate_cache(struct vgic_its *its)
@@ -618,6 +623,7 @@ int vgic_its_resolve_lpi(struct kvm *kvm, struct vgic_its *its,
{
struct kvm_vcpu *vcpu;
struct its_ite *ite;
+ int ret;
if (!its->enabled)
return -EBUSY;
@@ -633,7 +639,9 @@ int vgic_its_resolve_lpi(struct kvm *kvm, struct vgic_its *its,
if (!vgic_lpis_enabled(vcpu))
return -EBUSY;
- vgic_its_cache_translation(kvm, its, devid, eventid, ite->irq);
+ ret = vgic_its_cache_translation(kvm, its, devid, eventid, ite->irq);
+ if (ret)
+ return ret;
*irq = ite->irq;
return 0;
--
2.34.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] KVM: arm/arm64: vgic-its: Add error handling in vgic_its_cache_translation
2024-11-28 13:45 [PATCH] KVM: arm/arm64: vgic-its: Add error handling in vgic_its_cache_translation Keisuke Nishimura
@ 2024-11-28 16:54 ` Marc Zyngier
2024-11-28 17:55 ` Oliver Upton
1 sibling, 0 replies; 8+ messages in thread
From: Marc Zyngier @ 2024-11-28 16:54 UTC (permalink / raw)
To: Keisuke Nishimura
Cc: Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
linux-arm-kernel, kvmarm
On Thu, 28 Nov 2024 13:45:34 +0000,
Keisuke Nishimura <keisuke.nishimura@inria.fr> wrote:
>
> The xa_store() may fail because there is no guarantee that the cache_key
> index is already used in its->translation_cache. This fix (1) resolves
> the kref inconsistency on failure and (2) returns the error code.
Please describe the failure mode. Under which circumstances can this
fail?
>
> Fixes: 8201d1028caa ("KVM: arm64: vgic-its: Maintain a translation cache per ITS")
> Signed-off-by: Keisuke Nishimura <keisuke.nishimura@inria.fr>
> ---
> arch/arm64/kvm/vgic/vgic-its.c | 16 ++++++++++++----
> 1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
> index 198296933e7e..8f423857b7d2 100644
> --- a/arch/arm64/kvm/vgic/vgic-its.c
> +++ b/arch/arm64/kvm/vgic/vgic-its.c
> @@ -555,7 +555,7 @@ static struct vgic_irq *vgic_its_check_cache(struct kvm *kvm, phys_addr_t db,
> return irq;
> }
>
> -static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
> +static int vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
> u32 devid, u32 eventid,
> struct vgic_irq *irq)
> {
> @@ -564,7 +564,11 @@ static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
>
> /* Do not cache a directly injected interrupt */
> if (irq->hw)
> - return;
> + return 0;
> +
> + old = xa_store(&its->translation_cache, cache_key, irq, GFP_KERNEL_ACCOUNT);
> + if (xa_is_err(old))
> + return xa_err(old);
Accessing the translation cache before the assert that checks the ITS
lock is on its own pretty dodgy...
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] KVM: arm/arm64: vgic-its: Add error handling in vgic_its_cache_translation
2024-11-28 13:45 [PATCH] KVM: arm/arm64: vgic-its: Add error handling in vgic_its_cache_translation Keisuke Nishimura
2024-11-28 16:54 ` Marc Zyngier
@ 2024-11-28 17:55 ` Oliver Upton
2024-11-28 19:04 ` Keisuke Nishimura
1 sibling, 1 reply; 8+ messages in thread
From: Oliver Upton @ 2024-11-28 17:55 UTC (permalink / raw)
To: Keisuke Nishimura
Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
linux-arm-kernel, kvmarm
On Thu, Nov 28, 2024 at 02:45:34PM +0100, Keisuke Nishimura wrote:
> The xa_store() may fail because there is no guarantee that the cache_key
> index is already used in its->translation_cache. This fix (1) resolves
> the kref inconsistency on failure and (2) returns the error code.
xa_store() doesn't fail if an entry is already present at the specified
index. It returns the old entry, which is why we have a vgic_put_irq()
on the "error" path.
Genuine error handling definitely is missing here, but that would only
happen if the xarray library failed to allocate (-ENOMEM) or if the
xarray itself is broken beyond repair (-EINVAL).
> Fixes: 8201d1028caa ("KVM: arm64: vgic-its: Maintain a translation cache per ITS")
> Signed-off-by: Keisuke Nishimura <keisuke.nishimura@inria.fr>
> ---
> arch/arm64/kvm/vgic/vgic-its.c | 16 ++++++++++++----
> 1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
> index 198296933e7e..8f423857b7d2 100644
> --- a/arch/arm64/kvm/vgic/vgic-its.c
> +++ b/arch/arm64/kvm/vgic/vgic-its.c
> @@ -555,7 +555,7 @@ static struct vgic_irq *vgic_its_check_cache(struct kvm *kvm, phys_addr_t db,
> return irq;
> }
>
> -static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
> +static int vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
> u32 devid, u32 eventid,
> struct vgic_irq *irq)
This was deliberately made a void return. The entire translation cache
is opportunistic and not required for functional correctness. Nothing
breaks if we fail to insert an entry for, say, a failed memory
allocation.
It would be extremely helpful if you could share the steps to reproduce
the error you observe.
--
Thanks,
Oliver
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] KVM: arm/arm64: vgic-its: Add error handling in vgic_its_cache_translation
2024-11-28 17:55 ` Oliver Upton
@ 2024-11-28 19:04 ` Keisuke Nishimura
2024-11-29 7:20 ` Oliver Upton
0 siblings, 1 reply; 8+ messages in thread
From: Keisuke Nishimura @ 2024-11-28 19:04 UTC (permalink / raw)
To: Oliver Upton
Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
linux-arm-kernel, kvmarm
Hello Marc and Oliver,
Thank you for the replies.
On 28/11/2024 18:55, Oliver Upton wrote:
> On Thu, Nov 28, 2024 at 02:45:34PM +0100, Keisuke Nishimura wrote:
>> The xa_store() may fail because there is no guarantee that the cache_key
>> index is already used in its->translation_cache. This fix (1) resolves
>> the kref inconsistency on failure and (2) returns the error code.
>
> xa_store() doesn't fail if an entry is already present at the specified
> index. It returns the old entry, which is why we have a vgic_put_irq()
> on the "error" path.
Sure. But to my understanding, this could be the first store to
its->translation_cache with cache_key, and that is why old can be NULL? I am not
very familiar with this code, so please correct me if I am wrong.
>
> Genuine error handling definitely is missing here, but that would only
> happen if the xarray library failed to allocate (-ENOMEM) or if the
> xarray itself is broken beyond repair (-EINVAL).
>
>> Fixes: 8201d1028caa ("KVM: arm64: vgic-its: Maintain a translation cache per ITS")
>> Signed-off-by: Keisuke Nishimura <keisuke.nishimura@inria.fr>
>> ---
>> arch/arm64/kvm/vgic/vgic-its.c | 16 ++++++++++++----
>> 1 file changed, 12 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
>> index 198296933e7e..8f423857b7d2 100644
>> --- a/arch/arm64/kvm/vgic/vgic-its.c
>> +++ b/arch/arm64/kvm/vgic/vgic-its.c
>> @@ -555,7 +555,7 @@ static struct vgic_irq *vgic_its_check_cache(struct kvm *kvm, phys_addr_t db,
>> return irq;
>> }
>>
>> -static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
>> +static int vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
>> u32 devid, u32 eventid,
>> struct vgic_irq *irq)
>
> This was deliberately made a void return. The entire translation cache
> is opportunistic and not required for functional correctness. Nothing
> breaks if we fail to insert an entry for, say, a failed memory
> allocation.
If my point above is correct, in the next version, we can ensure no memory error
here by calling xa_reserve() in advance. Or, we can ignore the error, and just
make the kref consistent.
>
> It would be extremely helpful if you could share the steps to reproduce
> the error you observe.
>
I encountered this when I studied the usage of xa_*. After some investigation to
confirm that the call of xa_store() might be the first store with cache_key, I
sent this patch. (I apologize if I missed anything.)
Thank you,
Keisuke
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] KVM: arm/arm64: vgic-its: Add error handling in vgic_its_cache_translation
2024-11-28 19:04 ` Keisuke Nishimura
@ 2024-11-29 7:20 ` Oliver Upton
2024-11-29 11:30 ` Keisuke Nishimura
0 siblings, 1 reply; 8+ messages in thread
From: Oliver Upton @ 2024-11-29 7:20 UTC (permalink / raw)
To: Keisuke Nishimura
Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
linux-arm-kernel, kvmarm
On Thu, Nov 28, 2024 at 08:04:01PM +0100, Keisuke Nishimura wrote:
> Hello Marc and Oliver,
>
> Thank you for the replies.
>
> On 28/11/2024 18:55, Oliver Upton wrote:
> > On Thu, Nov 28, 2024 at 02:45:34PM +0100, Keisuke Nishimura wrote:
> > > The xa_store() may fail because there is no guarantee that the cache_key
> > > index is already used in its->translation_cache. This fix (1) resolves
> > > the kref inconsistency on failure and (2) returns the error code.
> >
> > xa_store() doesn't fail if an entry is already present at the specified
> > index. It returns the old entry, which is why we have a vgic_put_irq()
> > on the "error" path.
>
> Sure. But to my understanding, this could be the first store to
> its->translation_cache with cache_key, and that is why old can be NULL? I am
> not very familiar with this code, so please correct me if I am wrong.
So we take a reference on @irq to represent the pointer to it that's
stored in the translation cache. There are 3 possible outcomes after the
store:
- Store succeeds and no pre-existing entry.
- Store succeeds and there's a pre-existing entry. put the reference on
@old since it was evicted from the translation cache
- Store fails because of a failed memory allocation / error in xarray
library.
We don't handle #3, and the correct thing to do in this case is to put
@irq since it was never made visible in the translation cache.
So I think we'd want to do something similar to the following.
--
Thanks,
Oliver
diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
index f4c4494645c3..fb96802799c6 100644
--- a/arch/arm64/kvm/vgic/vgic-its.c
+++ b/arch/arm64/kvm/vgic/vgic-its.c
@@ -608,12 +608,22 @@ static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
lockdep_assert_held(&its->its_lock);
vgic_get_irq_kref(irq);
+ old = xa_store(&its->translation_cache, cache_key, irq, GFP_KERNEL_ACCOUNT);
+
+ /*
+ * Put the reference taken on @irq if the store fails. Intentionally do
+ * not return the error as the translation cache is best effort.
+ */
+ if (xa_is_err(old)) {
+ vgic_put_irq(kvm, irq);
+ return;
+ }
+
/*
* We could have raced with another CPU caching the same
* translation behind our back, ensure we don't leak a
* reference if that is the case.
*/
- old = xa_store(&its->translation_cache, cache_key, irq, GFP_KERNEL_ACCOUNT);
if (old)
vgic_put_irq(kvm, old);
}
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] KVM: arm/arm64: vgic-its: Add error handling in vgic_its_cache_translation
2024-11-29 7:20 ` Oliver Upton
@ 2024-11-29 11:30 ` Keisuke Nishimura
2024-11-29 16:33 ` Oliver Upton
0 siblings, 1 reply; 8+ messages in thread
From: Keisuke Nishimura @ 2024-11-29 11:30 UTC (permalink / raw)
To: Oliver Upton
Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
linux-arm-kernel, kvmarm
On 29/11/2024 08:20, Oliver Upton wrote:
> On Thu, Nov 28, 2024 at 08:04:01PM +0100, Keisuke Nishimura wrote:
>> Hello Marc and Oliver,
>>
>> Thank you for the replies.
>>
>> On 28/11/2024 18:55, Oliver Upton wrote:
>>> On Thu, Nov 28, 2024 at 02:45:34PM +0100, Keisuke Nishimura wrote:
>>>> The xa_store() may fail because there is no guarantee that the cache_key
>>>> index is already used in its->translation_cache. This fix (1) resolves
>>>> the kref inconsistency on failure and (2) returns the error code.
>>>
>>> xa_store() doesn't fail if an entry is already present at the specified
>>> index. It returns the old entry, which is why we have a vgic_put_irq()
>>> on the "error" path.
>>
>> Sure. But to my understanding, this could be the first store to
>> its->translation_cache with cache_key, and that is why old can be NULL? I am
>> not very familiar with this code, so please correct me if I am wrong.
>
> So we take a reference on @irq to represent the pointer to it that's
> stored in the translation cache. There are 3 possible outcomes after the
> store:
>
> - Store succeeds and no pre-existing entry.
>
> - Store succeeds and there's a pre-existing entry. put the reference on
> @old since it was evicted from the translation cache
>
> - Store fails because of a failed memory allocation / error in xarray
> library.
>
> We don't handle #3, and the correct thing to do in this case is to put
> @irq since it was never made visible in the translation cache.
>
> So I think we'd want to do something similar to the following.
>
Hello,
Thanks for the comment. xa_reserve() can allocate the memory in advance and
ensure the success of the subsequent xa_store(). FYI, the document of
xa_reserve() says:
> Ensures there is somewhere to store an entry at index in the array. If there
is already something stored at index, this function does nothing. If there was
nothing there, the entry is marked as reserved. Loading from a reserved entry
returns a NULL pointer.
I am thinking of calling it before the reference in v2. Here is the suggestion:
---
arch/arm64/kvm/vgic/vgic-its.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
index 198296933e7e..1422812ad6d7 100644
--- a/arch/arm64/kvm/vgic/vgic-its.c
+++ b/arch/arm64/kvm/vgic/vgic-its.c
@@ -561,11 +561,20 @@ static void vgic_its_cache_translation(struct kvm *kvm,
struct vgic_its *its,
{
unsigned long cache_key = vgic_its_cache_key(devid, eventid);
struct vgic_irq *old;
+ int ret;
/* Do not cache a directly injected interrupt */
if (irq->hw)
return;
+ /*
+ * Ensure the following xa_store() will not fail. Intentionally do
+ * not return the error as the translation cache is best effort.
+ */
+ ret = xa_reserve(&its->translation_cache, cache_key, GFP_KERNEL_ACCOUNT);
+ if (ret)
+ return;
+
/*
* The irq refcount is guaranteed to be nonzero while holding the
* its_lock, as the ITE (and the reference it holds) cannot be freed.
--
2.34.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] KVM: arm/arm64: vgic-its: Add error handling in vgic_its_cache_translation
2024-11-29 11:30 ` Keisuke Nishimura
@ 2024-11-29 16:33 ` Oliver Upton
2024-11-29 16:59 ` Keisuke Nishimura
0 siblings, 1 reply; 8+ messages in thread
From: Oliver Upton @ 2024-11-29 16:33 UTC (permalink / raw)
To: Keisuke Nishimura
Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
linux-arm-kernel, kvmarm
On Fri, Nov 29, 2024 at 12:30:52PM +0100, Keisuke Nishimura wrote:
> On 29/11/2024 08:20, Oliver Upton wrote:
> > - Store fails because of a failed memory allocation / error in xarray
> > library.
> >
> > We don't handle #3, and the correct thing to do in this case is to put
> > @irq since it was never made visible in the translation cache.
> >
> > So I think we'd want to do something similar to the following.
> >
>
> Hello,
>
> Thanks for the comment. xa_reserve() can allocate the memory in advance and
> ensure the success of the subsequent xa_store().
Yes, but xa_store() can fail for other reasons than a failed memory
allocation. Which is why I recommended adding an explicit error handling
path to cover *all* failures to store in the xarray.
--
Thanks,
Oliver
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] KVM: arm/arm64: vgic-its: Add error handling in vgic_its_cache_translation
2024-11-29 16:33 ` Oliver Upton
@ 2024-11-29 16:59 ` Keisuke Nishimura
0 siblings, 0 replies; 8+ messages in thread
From: Keisuke Nishimura @ 2024-11-29 16:59 UTC (permalink / raw)
To: Oliver Upton
Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
linux-arm-kernel, kvmarm
On 29/11/2024 17:33, Oliver Upton wrote:
> On Fri, Nov 29, 2024 at 12:30:52PM +0100, Keisuke Nishimura wrote:
>> On 29/11/2024 08:20, Oliver Upton wrote:
>>> - Store fails because of a failed memory allocation / error in xarray
>>> library.
>>>
>>> We don't handle #3, and the correct thing to do in this case is to put
>>> @irq since it was never made visible in the translation cache.
>>>
>>> So I think we'd want to do something similar to the following.
>>>
>>
>> Hello,
>>
>> Thanks for the comment. xa_reserve() can allocate the memory in advance and
>> ensure the success of the subsequent xa_store().
>
> Yes, but xa_store() can fail for other reasons than a failed memory
> allocation. Which is why I recommended adding an explicit error handling
> path to cover *all* failures to store in the xarray.
>
Okay. Then, in the next version, I will send the change you suggested in the
previous mail:)
thanks,
Keisuke
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2024-11-29 17:01 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-28 13:45 [PATCH] KVM: arm/arm64: vgic-its: Add error handling in vgic_its_cache_translation Keisuke Nishimura
2024-11-28 16:54 ` Marc Zyngier
2024-11-28 17:55 ` Oliver Upton
2024-11-28 19:04 ` Keisuke Nishimura
2024-11-29 7:20 ` Oliver Upton
2024-11-29 11:30 ` Keisuke Nishimura
2024-11-29 16:33 ` Oliver Upton
2024-11-29 16:59 ` Keisuke Nishimura
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).