linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] irqchip/gic-v4: Don't allow a VMOVP on a dying VPE
@ 2024-10-02 20:49 Marc Zyngier
  2024-10-02 22:17 ` Thomas Gleixner
  2024-10-22  7:45 ` Zenghui Yu
  0 siblings, 2 replies; 8+ messages in thread
From: Marc Zyngier @ 2024-10-02 20:49 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel; +Cc: Thomas Gleixner, Kunkun Jiang

Kunkun Jiang reports that there is a small window of opportunity for
userspace to force a change of affinity for a VPE while the VPE has
already been unmapped, but the corresponding doorbell interrupt still
visible in /proc/irq/.

Plug the race by checking the value of vmapp_count, which tracks whether
the VPE is mapped ot not, and returning an error in this case.

This involves making vmapp_count common to both GICv4.1 and its v4.0
ancestor.

Reported-by: Kunkun Jiang <jiangkunkun@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/c182ece6-2ba0-ce4f-3404-dba7a3ab6c52@huawei.com
---
 drivers/irqchip/irq-gic-v3-its.c   | 18 ++++++++++++------
 include/linux/irqchip/arm-gic-v4.h |  4 +++-
 2 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index fdec478ba5e7..ab597e74ba08 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -797,8 +797,8 @@ static struct its_vpe *its_build_vmapp_cmd(struct its_node *its,
 	its_encode_valid(cmd, desc->its_vmapp_cmd.valid);
 
 	if (!desc->its_vmapp_cmd.valid) {
+		alloc = !atomic_dec_return(&desc->its_vmapp_cmd.vpe->vmapp_count);
 		if (is_v4_1(its)) {
-			alloc = !atomic_dec_return(&desc->its_vmapp_cmd.vpe->vmapp_count);
 			its_encode_alloc(cmd, alloc);
 			/*
 			 * Unmapping a VPE is self-synchronizing on GICv4.1,
@@ -817,13 +817,13 @@ static struct its_vpe *its_build_vmapp_cmd(struct its_node *its,
 	its_encode_vpt_addr(cmd, vpt_addr);
 	its_encode_vpt_size(cmd, LPI_NRBITS - 1);
 
+	alloc = !atomic_fetch_inc(&desc->its_vmapp_cmd.vpe->vmapp_count);
+
 	if (!is_v4_1(its))
 		goto out;
 
 	vconf_addr = virt_to_phys(page_address(desc->its_vmapp_cmd.vpe->its_vm->vprop_page));
 
-	alloc = !atomic_fetch_inc(&desc->its_vmapp_cmd.vpe->vmapp_count);
-
 	its_encode_alloc(cmd, alloc);
 
 	/*
@@ -3806,6 +3806,13 @@ static int its_vpe_set_affinity(struct irq_data *d,
 	struct cpumask *table_mask;
 	unsigned long flags;
 
+	/*
+	 * Check if we're racing against a VPE being destroyed, for
+	 * which we don't want to allow a VMOVP.
+	 */
+	if (!atomic_read(&vpe->vmapp_count))
+		return -EINVAL;
+
 	/*
 	 * Changing affinity is mega expensive, so let's be as lazy as
 	 * we can and only do it if we really have to. Also, if mapped
@@ -4463,9 +4470,8 @@ static int its_vpe_init(struct its_vpe *vpe)
 	raw_spin_lock_init(&vpe->vpe_lock);
 	vpe->vpe_id = vpe_id;
 	vpe->vpt_page = vpt_page;
-	if (gic_rdists->has_rvpeid)
-		atomic_set(&vpe->vmapp_count, 0);
-	else
+	atomic_set(&vpe->vmapp_count, 0);
+	if (!gic_rdists->has_rvpeid)
 		vpe->vpe_proxy_event = -1;
 
 	return 0;
diff --git a/include/linux/irqchip/arm-gic-v4.h b/include/linux/irqchip/arm-gic-v4.h
index ecabed6d3307..7f1f11a5e4e4 100644
--- a/include/linux/irqchip/arm-gic-v4.h
+++ b/include/linux/irqchip/arm-gic-v4.h
@@ -66,10 +66,12 @@ struct its_vpe {
 				bool	enabled;
 				bool	group;
 			}			sgi_config[16];
-			atomic_t vmapp_count;
 		};
 	};
 
+	/* Track the VPE being mapped */
+	atomic_t vmapp_count;
+
 	/*
 	 * Ensures mutual exclusion between affinity setting of the
 	 * vPE and vLPI operations using vpe->col_idx.
-- 
2.39.2



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] irqchip/gic-v4: Don't allow a VMOVP on a dying VPE
  2024-10-02 20:49 [PATCH] irqchip/gic-v4: Don't allow a VMOVP on a dying VPE Marc Zyngier
@ 2024-10-02 22:17 ` Thomas Gleixner
  2024-10-02 23:05   ` Marc Zyngier
  2024-10-22  7:45 ` Zenghui Yu
  1 sibling, 1 reply; 8+ messages in thread
From: Thomas Gleixner @ 2024-10-02 22:17 UTC (permalink / raw)
  To: Marc Zyngier, linux-kernel, linux-arm-kernel; +Cc: Kunkun Jiang

On Wed, Oct 02 2024 at 21:49, Marc Zyngier wrote:
> Kunkun Jiang reports that there is a small window of opportunity for
> userspace to force a change of affinity for a VPE while the VPE has
> already been unmapped, but the corresponding doorbell interrupt still
> visible in /proc/irq/.
>
> Plug the race by checking the value of vmapp_count, which tracks whether
> the VPE is mapped ot not, and returning an error in this case.
>
> This involves making vmapp_count common to both GICv4.1 and its v4.0
> ancestor.
>
> Reported-by: Kunkun Jiang <jiangkunkun@huawei.com>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Link: https://lore.kernel.org/r/c182ece6-2ba0-ce4f-3404-dba7a3ab6c52@huawei.com

I assume this wants a Fixes: tag and a cc: stable, no?

Thanks,

        tglx


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] irqchip/gic-v4: Don't allow a VMOVP on a dying VPE
  2024-10-02 22:17 ` Thomas Gleixner
@ 2024-10-02 23:05   ` Marc Zyngier
  0 siblings, 0 replies; 8+ messages in thread
From: Marc Zyngier @ 2024-10-02 23:05 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: linux-kernel, linux-arm-kernel, Kunkun Jiang

On Wed, 02 Oct 2024 23:17:02 +0100,
Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> On Wed, Oct 02 2024 at 21:49, Marc Zyngier wrote:
> > Kunkun Jiang reports that there is a small window of opportunity for
> > userspace to force a change of affinity for a VPE while the VPE has
> > already been unmapped, but the corresponding doorbell interrupt still
> > visible in /proc/irq/.
> >
> > Plug the race by checking the value of vmapp_count, which tracks whether
> > the VPE is mapped ot not, and returning an error in this case.
> >
> > This involves making vmapp_count common to both GICv4.1 and its v4.0
> > ancestor.
> >
> > Reported-by: Kunkun Jiang <jiangkunkun@huawei.com>
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > Link: https://lore.kernel.org/r/c182ece6-2ba0-ce4f-3404-dba7a3ab6c52@huawei.com
> 
> I assume this wants a Fixes: tag and a cc: stable, no?

Unclear.

While this is clearly a bug, the architectural effects are not fatal,
and nothing goes really wrong. However, some implementations are
reporting this as a RAS error. That's a bit silly, because this isn't
indicative of HW rotting away, and only a sure way to shoot yourself
in the foot. That's the real bug IMO.

So if these people are really hung up on having this addressed in
prehistoric kernels, we can always add:

Fixes: 64edfaa9a234 ("irqchip/gic-v4.1: Implement the v4.1 flavour of VMAPP")

which points to the commit that implements the infrastructure we're
relying on. GICv4.0, which predates the above by at least a couple of
years is also affected, but nobody really cares about that.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] irqchip/gic-v4: Don't allow a VMOVP on a dying VPE
  2024-10-02 20:49 [PATCH] irqchip/gic-v4: Don't allow a VMOVP on a dying VPE Marc Zyngier
  2024-10-02 22:17 ` Thomas Gleixner
@ 2024-10-22  7:45 ` Zenghui Yu
  2024-10-23  8:49   ` Marc Zyngier
  1 sibling, 1 reply; 8+ messages in thread
From: Zenghui Yu @ 2024-10-22  7:45 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-kernel, linux-arm-kernel, Thomas Gleixner, Kunkun Jiang

Hi Marc,

On 2024/10/3 4:49, Marc Zyngier wrote:
> Kunkun Jiang reports that there is a small window of opportunity for
> userspace to force a change of affinity for a VPE while the VPE has
> already been unmapped, but the corresponding doorbell interrupt still
> visible in /proc/irq/.
> 
> Plug the race by checking the value of vmapp_count, which tracks whether
> the VPE is mapped ot not, and returning an error in this case.
> 
> This involves making vmapp_count common to both GICv4.1 and its v4.0
> ancestor.
> 
> Reported-by: Kunkun Jiang <jiangkunkun@huawei.com>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Link: https://lore.kernel.org/r/c182ece6-2ba0-ce4f-3404-dba7a3ab6c52@huawei.com
> ---
>  drivers/irqchip/irq-gic-v3-its.c   | 18 ++++++++++++------
>  include/linux/irqchip/arm-gic-v4.h |  4 +++-
>  2 files changed, 15 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index fdec478ba5e7..ab597e74ba08 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -797,8 +797,8 @@ static struct its_vpe *its_build_vmapp_cmd(struct its_node *its,
>  	its_encode_valid(cmd, desc->its_vmapp_cmd.valid);
>  
>  	if (!desc->its_vmapp_cmd.valid) {
> +		alloc = !atomic_dec_return(&desc->its_vmapp_cmd.vpe->vmapp_count);
>  		if (is_v4_1(its)) {
> -			alloc = !atomic_dec_return(&desc->its_vmapp_cmd.vpe->vmapp_count);
>  			its_encode_alloc(cmd, alloc);
>  			/*
>  			 * Unmapping a VPE is self-synchronizing on GICv4.1,
> @@ -817,13 +817,13 @@ static struct its_vpe *its_build_vmapp_cmd(struct its_node *its,
>  	its_encode_vpt_addr(cmd, vpt_addr);
>  	its_encode_vpt_size(cmd, LPI_NRBITS - 1);
>  
> +	alloc = !atomic_fetch_inc(&desc->its_vmapp_cmd.vpe->vmapp_count);
> +
>  	if (!is_v4_1(its))
>  		goto out;
>  
>  	vconf_addr = virt_to_phys(page_address(desc->its_vmapp_cmd.vpe->its_vm->vprop_page));
>  
> -	alloc = !atomic_fetch_inc(&desc->its_vmapp_cmd.vpe->vmapp_count);
> -
>  	its_encode_alloc(cmd, alloc);
>  
>  	/*
> @@ -3806,6 +3806,13 @@ static int its_vpe_set_affinity(struct irq_data *d,
>  	struct cpumask *table_mask;
>  	unsigned long flags;
>  
> +	/*
> +	 * Check if we're racing against a VPE being destroyed, for
> +	 * which we don't want to allow a VMOVP.
> +	 */
> +	if (!atomic_read(&vpe->vmapp_count))
> +		return -EINVAL;

We lazily map the vPE so that vmapp_count is likely to be 0 on GICv4.0
implementations with the ITSList feature. Seems that that implementation
is not affected by the reported race and we don't need to check
vmapp_count for that.

Testing rc4 on my 920 server triggers the WARN_ON() in vgic_v3_load().

void vgic_v3_load(struct kvm_vcpu *vcpu)
{
	WARN_ON(vgic_v4_load(vcpu));

Thanks,
Zenghui


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] irqchip/gic-v4: Don't allow a VMOVP on a dying VPE
  2024-10-22  7:45 ` Zenghui Yu
@ 2024-10-23  8:49   ` Marc Zyngier
  2024-10-23 13:51     ` Zenghui Yu
  0 siblings, 1 reply; 8+ messages in thread
From: Marc Zyngier @ 2024-10-23  8:49 UTC (permalink / raw)
  To: Zenghui Yu; +Cc: linux-kernel, linux-arm-kernel, Thomas Gleixner, Kunkun Jiang

Hi Zenghui,

On Tue, 22 Oct 2024 08:45:17 +0100,
Zenghui Yu <yuzenghui@huawei.com> wrote:
> 
> Hi Marc,
> 
> On 2024/10/3 4:49, Marc Zyngier wrote:
> > Kunkun Jiang reports that there is a small window of opportunity for
> > userspace to force a change of affinity for a VPE while the VPE has
> > already been unmapped, but the corresponding doorbell interrupt still
> > visible in /proc/irq/.
> > 
> > Plug the race by checking the value of vmapp_count, which tracks whether
> > the VPE is mapped ot not, and returning an error in this case.
> > 
> > This involves making vmapp_count common to both GICv4.1 and its v4.0
> > ancestor.
> > 
> > Reported-by: Kunkun Jiang <jiangkunkun@huawei.com>
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > Link: https://lore.kernel.org/r/c182ece6-2ba0-ce4f-3404-dba7a3ab6c52@huawei.com
> > ---
> >  drivers/irqchip/irq-gic-v3-its.c   | 18 ++++++++++++------
> >  include/linux/irqchip/arm-gic-v4.h |  4 +++-
> >  2 files changed, 15 insertions(+), 7 deletions(-)
> > 
> > diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> > index fdec478ba5e7..ab597e74ba08 100644
> > --- a/drivers/irqchip/irq-gic-v3-its.c
> > +++ b/drivers/irqchip/irq-gic-v3-its.c
> > @@ -797,8 +797,8 @@ static struct its_vpe *its_build_vmapp_cmd(struct its_node *its,
> >  	its_encode_valid(cmd, desc->its_vmapp_cmd.valid);
> >  
> >  	if (!desc->its_vmapp_cmd.valid) {
> > +		alloc = !atomic_dec_return(&desc->its_vmapp_cmd.vpe->vmapp_count);
> >  		if (is_v4_1(its)) {
> > -			alloc = !atomic_dec_return(&desc->its_vmapp_cmd.vpe->vmapp_count);
> >  			its_encode_alloc(cmd, alloc);
> >  			/*
> >  			 * Unmapping a VPE is self-synchronizing on GICv4.1,
> > @@ -817,13 +817,13 @@ static struct its_vpe *its_build_vmapp_cmd(struct its_node *its,
> >  	its_encode_vpt_addr(cmd, vpt_addr);
> >  	its_encode_vpt_size(cmd, LPI_NRBITS - 1);
> >  
> > +	alloc = !atomic_fetch_inc(&desc->its_vmapp_cmd.vpe->vmapp_count);
> > +
> >  	if (!is_v4_1(its))
> >  		goto out;
> >  
> >  	vconf_addr = virt_to_phys(page_address(desc->its_vmapp_cmd.vpe->its_vm->vprop_page));
> >  
> > -	alloc = !atomic_fetch_inc(&desc->its_vmapp_cmd.vpe->vmapp_count);
> > -
> >  	its_encode_alloc(cmd, alloc);
> >  
> >  	/*
> > @@ -3806,6 +3806,13 @@ static int its_vpe_set_affinity(struct irq_data *d,
> >  	struct cpumask *table_mask;
> >  	unsigned long flags;
> >  
> > +	/*
> > +	 * Check if we're racing against a VPE being destroyed, for
> > +	 * which we don't want to allow a VMOVP.
> > +	 */
> > +	if (!atomic_read(&vpe->vmapp_count))
> > +		return -EINVAL;
> 
> We lazily map the vPE so that vmapp_count is likely to be 0 on GICv4.0
> implementations with the ITSList feature. Seems that that implementation
> is not affected by the reported race and we don't need to check
> vmapp_count for that.

Indeed, the ITSList guards the sending of VMOVP in that case, and we
avoid the original issue in that case. However, this still translates
in the doorbell being moved for no reason (see its_vpe_db_proxy_move).

How about something like this?

Thanks,

	M.

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index ab597e74ba08..ac8ed56f1e48 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -3810,8 +3810,17 @@ static int its_vpe_set_affinity(struct irq_data *d,
 	 * Check if we're racing against a VPE being destroyed, for
 	 * which we don't want to allow a VMOVP.
 	 */
-	if (!atomic_read(&vpe->vmapp_count))
-		return -EINVAL;
+	if (!atomic_read(&vpe->vmapp_count)) {
+		if (gic_requires_eager_mapping())
+			return -EINVAL;
+
+		/*
+		 * If we lazily map the VPEs, this isn't an error, and
+		 * we exit cleanly.
+		 */
+		irq_data_update_effective_affinity(d, cpumask_of(cpu));
+		return IRQ_SET_MASK_OK_DONE;
+	}
 
 	/*
 	 * Changing affinity is mega expensive, so let's be as lazy as

-- 
Without deviation from the norm, progress is not possible.


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] irqchip/gic-v4: Don't allow a VMOVP on a dying VPE
  2024-10-23  8:49   ` Marc Zyngier
@ 2024-10-23 13:51     ` Zenghui Yu
  2024-10-23 14:23       ` Marc Zyngier
  0 siblings, 1 reply; 8+ messages in thread
From: Zenghui Yu @ 2024-10-23 13:51 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-kernel, linux-arm-kernel, Thomas Gleixner, Kunkun Jiang

On 2024/10/23 16:49, Marc Zyngier wrote:
> Hi Zenghui,
> 
> On Tue, 22 Oct 2024 08:45:17 +0100,
> Zenghui Yu <yuzenghui@huawei.com> wrote:
> >
> > Hi Marc,
> >
> > On 2024/10/3 4:49, Marc Zyngier wrote:
> > > Kunkun Jiang reports that there is a small window of opportunity for
> > > userspace to force a change of affinity for a VPE while the VPE has
> > > already been unmapped, but the corresponding doorbell interrupt still
> > > visible in /proc/irq/.
> > >
> > > Plug the race by checking the value of vmapp_count, which tracks whether
> > > the VPE is mapped ot not, and returning an error in this case.
> > >
> > > This involves making vmapp_count common to both GICv4.1 and its v4.0
> > > ancestor.
> > >
> > > Reported-by: Kunkun Jiang <jiangkunkun@huawei.com>
> > > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > > Link: https://lore.kernel.org/r/c182ece6-2ba0-ce4f-3404-dba7a3ab6c52@huawei.com
> > > ---
> > >  drivers/irqchip/irq-gic-v3-its.c   | 18 ++++++++++++------
> > >  include/linux/irqchip/arm-gic-v4.h |  4 +++-
> > >  2 files changed, 15 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> > > index fdec478ba5e7..ab597e74ba08 100644
> > > --- a/drivers/irqchip/irq-gic-v3-its.c
> > > +++ b/drivers/irqchip/irq-gic-v3-its.c
> > > @@ -797,8 +797,8 @@ static struct its_vpe *its_build_vmapp_cmd(struct its_node *its,
> > >  	its_encode_valid(cmd, desc->its_vmapp_cmd.valid);
> > >  
> > >  	if (!desc->its_vmapp_cmd.valid) {
> > > +		alloc = !atomic_dec_return(&desc->its_vmapp_cmd.vpe->vmapp_count);
> > >  		if (is_v4_1(its)) {
> > > -			alloc = !atomic_dec_return(&desc->its_vmapp_cmd.vpe->vmapp_count);
> > >  			its_encode_alloc(cmd, alloc);
> > >  			/*
> > >  			 * Unmapping a VPE is self-synchronizing on GICv4.1,
> > > @@ -817,13 +817,13 @@ static struct its_vpe *its_build_vmapp_cmd(struct its_node *its,
> > >  	its_encode_vpt_addr(cmd, vpt_addr);
> > >  	its_encode_vpt_size(cmd, LPI_NRBITS - 1);
> > >  
> > > +	alloc = !atomic_fetch_inc(&desc->its_vmapp_cmd.vpe->vmapp_count);
> > > +
> > >  	if (!is_v4_1(its))
> > >  		goto out;
> > >  
> > >  	vconf_addr = virt_to_phys(page_address(desc->its_vmapp_cmd.vpe->its_vm->vprop_page));
> > >  
> > > -	alloc = !atomic_fetch_inc(&desc->its_vmapp_cmd.vpe->vmapp_count);
> > > -
> > >  	its_encode_alloc(cmd, alloc);
> > >  
> > >  	/*
> > > @@ -3806,6 +3806,13 @@ static int its_vpe_set_affinity(struct irq_data *d,
> > >  	struct cpumask *table_mask;
> > >  	unsigned long flags;
> > >  
> > > +	/*
> > > +	 * Check if we're racing against a VPE being destroyed, for
> > > +	 * which we don't want to allow a VMOVP.
> > > +	 */
> > > +	if (!atomic_read(&vpe->vmapp_count))
> > > +		return -EINVAL;
> >
> > We lazily map the vPE so that vmapp_count is likely to be 0 on GICv4.0
> > implementations with the ITSList feature. Seems that that implementation
> > is not affected by the reported race and we don't need to check
> > vmapp_count for that.
> 
> Indeed, the ITSList guards the sending of VMOVP in that case, and we
> avoid the original issue in that case. However, this still translates
> in the doorbell being moved for no reason (see its_vpe_db_proxy_move).

Yup.

> How about something like this?

I'm pretty sure that the splat will disappear with that.

> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index ab597e74ba08..ac8ed56f1e48 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -3810,8 +3810,17 @@ static int its_vpe_set_affinity(struct irq_data *d,
>  	 * Check if we're racing against a VPE being destroyed, for
>  	 * which we don't want to allow a VMOVP.
>  	 */
> -	if (!atomic_read(&vpe->vmapp_count))
> -		return -EINVAL;
> +	if (!atomic_read(&vpe->vmapp_count)) {
> +		if (gic_requires_eager_mapping())
> +			return -EINVAL;

Nitpick: why do we treat this as an error?

> +
> +		/*
> +		 * If we lazily map the VPEs, this isn't an error, and
> +		 * we exit cleanly.
> +		 */
> +		irq_data_update_effective_affinity(d, cpumask_of(cpu));

@cpu is uninitialized to a sensible value at this point?

> +		return IRQ_SET_MASK_OK_DONE;
> +	}
>  
>  	/*
>  	 * Changing affinity is mega expensive, so let's be as lazy as

Thanks,
Zenghui


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] irqchip/gic-v4: Don't allow a VMOVP on a dying VPE
  2024-10-23 13:51     ` Zenghui Yu
@ 2024-10-23 14:23       ` Marc Zyngier
  2024-10-24 11:28         ` Zenghui Yu
  0 siblings, 1 reply; 8+ messages in thread
From: Marc Zyngier @ 2024-10-23 14:23 UTC (permalink / raw)
  To: Zenghui Yu; +Cc: linux-kernel, linux-arm-kernel, Thomas Gleixner, Kunkun Jiang

On Wed, 23 Oct 2024 14:51:40 +0100,
Zenghui Yu <yuzenghui@huawei.com> wrote:
> 
> On 2024/10/23 16:49, Marc Zyngier wrote:
> > Hi Zenghui,
> > 
> > On Tue, 22 Oct 2024 08:45:17 +0100,
> > Zenghui Yu <yuzenghui@huawei.com> wrote:
> > >
> > > Hi Marc,
> > >
> > > On 2024/10/3 4:49, Marc Zyngier wrote:
> > > > Kunkun Jiang reports that there is a small window of opportunity for
> > > > userspace to force a change of affinity for a VPE while the VPE has
> > > > already been unmapped, but the corresponding doorbell interrupt still
> > > > visible in /proc/irq/.
> > > >
> > > > Plug the race by checking the value of vmapp_count, which tracks whether
> > > > the VPE is mapped ot not, and returning an error in this case.
> > > >
> > > > This involves making vmapp_count common to both GICv4.1 and its v4.0
> > > > ancestor.
> > > >
> > > > Reported-by: Kunkun Jiang <jiangkunkun@huawei.com>
> > > > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > > > Link: https://lore.kernel.org/r/c182ece6-2ba0-ce4f-3404-dba7a3ab6c52@huawei.com
> > > > ---
> > > >  drivers/irqchip/irq-gic-v3-its.c   | 18 ++++++++++++------
> > > >  include/linux/irqchip/arm-gic-v4.h |  4 +++-
> > > >  2 files changed, 15 insertions(+), 7 deletions(-)
> > > >
> > > > diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> > > > index fdec478ba5e7..ab597e74ba08 100644
> > > > --- a/drivers/irqchip/irq-gic-v3-its.c
> > > > +++ b/drivers/irqchip/irq-gic-v3-its.c
> > > > @@ -797,8 +797,8 @@ static struct its_vpe *its_build_vmapp_cmd(struct its_node *its,
> > > >  	its_encode_valid(cmd, desc->its_vmapp_cmd.valid);
> > > >  
> > > >  	if (!desc->its_vmapp_cmd.valid) {
> > > > +		alloc = !atomic_dec_return(&desc->its_vmapp_cmd.vpe->vmapp_count);
> > > >  		if (is_v4_1(its)) {
> > > > -			alloc = !atomic_dec_return(&desc->its_vmapp_cmd.vpe->vmapp_count);
> > > >  			its_encode_alloc(cmd, alloc);
> > > >  			/*
> > > >  			 * Unmapping a VPE is self-synchronizing on GICv4.1,
> > > > @@ -817,13 +817,13 @@ static struct its_vpe *its_build_vmapp_cmd(struct its_node *its,
> > > >  	its_encode_vpt_addr(cmd, vpt_addr);
> > > >  	its_encode_vpt_size(cmd, LPI_NRBITS - 1);
> > > >  
> > > > +	alloc = !atomic_fetch_inc(&desc->its_vmapp_cmd.vpe->vmapp_count);
> > > > +
> > > >  	if (!is_v4_1(its))
> > > >  		goto out;
> > > >  
> > > >  	vconf_addr = virt_to_phys(page_address(desc->its_vmapp_cmd.vpe->its_vm->vprop_page));
> > > >  
> > > > -	alloc = !atomic_fetch_inc(&desc->its_vmapp_cmd.vpe->vmapp_count);
> > > > -
> > > >  	its_encode_alloc(cmd, alloc);
> > > >  
> > > >  	/*
> > > > @@ -3806,6 +3806,13 @@ static int its_vpe_set_affinity(struct irq_data *d,
> > > >  	struct cpumask *table_mask;
> > > >  	unsigned long flags;
> > > >  
> > > > +	/*
> > > > +	 * Check if we're racing against a VPE being destroyed, for
> > > > +	 * which we don't want to allow a VMOVP.
> > > > +	 */
> > > > +	if (!atomic_read(&vpe->vmapp_count))
> > > > +		return -EINVAL;
> > >
> > > We lazily map the vPE so that vmapp_count is likely to be 0 on GICv4.0
> > > implementations with the ITSList feature. Seems that that implementation
> > > is not affected by the reported race and we don't need to check
> > > vmapp_count for that.
> > 
> > Indeed, the ITSList guards the sending of VMOVP in that case, and we
> > avoid the original issue in that case. However, this still translates
> > in the doorbell being moved for no reason (see its_vpe_db_proxy_move).
> 
> Yup.
> 
> > How about something like this?
> 
> I'm pretty sure that the splat will disappear with that.
> 
> > diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> > index ab597e74ba08..ac8ed56f1e48 100644
> > --- a/drivers/irqchip/irq-gic-v3-its.c
> > +++ b/drivers/irqchip/irq-gic-v3-its.c
> > @@ -3810,8 +3810,17 @@ static int its_vpe_set_affinity(struct irq_data *d,
> >  	 * Check if we're racing against a VPE being destroyed, for
> >  	 * which we don't want to allow a VMOVP.
> >  	 */
> > -	if (!atomic_read(&vpe->vmapp_count))
> > -		return -EINVAL;
> > +	if (!atomic_read(&vpe->vmapp_count)) {
> > +		if (gic_requires_eager_mapping())
> > +			return -EINVAL;
> 
> Nitpick: why do we treat this as an error?

Because at this stage we can't update the affinity anymore, and I see
it as basic courtesy to let the caller know.

> 
> > +
> > +		/*
> > +		 * If we lazily map the VPEs, this isn't an error, and
> > +		 * we exit cleanly.
> > +		 */
> > +		irq_data_update_effective_affinity(d, cpumask_of(cpu));
> 
> @cpu is uninitialized to a sensible value at this point?

Ah! As usual, I wrote this on the train this morning, before having
had much coffee, and didn't even compile-test it. Here's an amended
patch, similarly untested.

If that works for you, I'll put that in a proper patch for Thomas to
merge.

Thanks,

	M.

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index ab597e74ba08e..52f625e07658c 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -3810,8 +3810,18 @@ static int its_vpe_set_affinity(struct irq_data *d,
 	 * Check if we're racing against a VPE being destroyed, for
 	 * which we don't want to allow a VMOVP.
 	 */
-	if (!atomic_read(&vpe->vmapp_count))
-		return -EINVAL;
+	if (!atomic_read(&vpe->vmapp_count)) {
+		if (gic_requires_eager_mapping())
+			return -EINVAL;
+
+		/*
+		 * If we lazily map the VPEs, this isn't an error and
+		 * we can exit cleanly.
+		 */
+		cpu = cpumask_first(mask_val);
+		irq_data_update_effective_affinity(d, cpumask_of(cpu));
+		return IRQ_SET_MASK_OK_DONE;
+	}
 
 	/*
 	 * Changing affinity is mega expensive, so let's be as lazy as

-- 
Without deviation from the norm, progress is not possible.


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] irqchip/gic-v4: Don't allow a VMOVP on a dying VPE
  2024-10-23 14:23       ` Marc Zyngier
@ 2024-10-24 11:28         ` Zenghui Yu
  0 siblings, 0 replies; 8+ messages in thread
From: Zenghui Yu @ 2024-10-24 11:28 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-kernel, linux-arm-kernel, Thomas Gleixner, Kunkun Jiang

On 2024/10/23 22:23, Marc Zyngier wrote:
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index ab597e74ba08e..52f625e07658c 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -3810,8 +3810,18 @@ static int its_vpe_set_affinity(struct irq_data *d,
>  	 * Check if we're racing against a VPE being destroyed, for
>  	 * which we don't want to allow a VMOVP.
>  	 */
> -	if (!atomic_read(&vpe->vmapp_count))
> -		return -EINVAL;
> +	if (!atomic_read(&vpe->vmapp_count)) {
> +		if (gic_requires_eager_mapping())
> +			return -EINVAL;
> +
> +		/*
> +		 * If we lazily map the VPEs, this isn't an error and
> +		 * we can exit cleanly.
> +		 */
> +		cpu = cpumask_first(mask_val);
> +		irq_data_update_effective_affinity(d, cpumask_of(cpu));
> +		return IRQ_SET_MASK_OK_DONE;
> +	}
>  
>  	/*
>  	 * Changing affinity is mega expensive, so let's be as lazy as

Looks good, thanks!

Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2024-10-24 11:52 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-02 20:49 [PATCH] irqchip/gic-v4: Don't allow a VMOVP on a dying VPE Marc Zyngier
2024-10-02 22:17 ` Thomas Gleixner
2024-10-02 23:05   ` Marc Zyngier
2024-10-22  7:45 ` Zenghui Yu
2024-10-23  8:49   ` Marc Zyngier
2024-10-23 13:51     ` Zenghui Yu
2024-10-23 14:23       ` Marc Zyngier
2024-10-24 11:28         ` Zenghui Yu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).