linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] KVM: arm64: vgic-its fixes
@ 2016-08-03 16:13 Christoffer Dall
  2016-08-03 16:13 ` [PATCH v2 1/3] KVM: arm64: vgic-its: Handle errors from vgic_add_lpi Christoffer Dall
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Christoffer Dall @ 2016-08-03 16:13 UTC (permalink / raw)
  To: linux-arm-kernel

Here are three small fixes resulting from looking at the pull request of
kvmarm/next into kvm/next.

They basically deal with corner cases in the reference count handling and
atomicity issues.

Christoffer Dall (3):
  KVM: arm64: vgic-its: Handle errors from vgic_add_lpi
  KVM: arm64: vgic-its: Plug race in vgic_put_irq
  KVM: arm64: vgic-its: Make updates to propbaser/pendbaser atomic

 virt/kvm/arm/vgic/vgic-its.c     | 18 ++++++++++++++----
 virt/kvm/arm/vgic/vgic-mmio-v3.c | 10 ++++++++--
 virt/kvm/arm/vgic/vgic.c         | 20 ++++++++++----------
 3 files changed, 32 insertions(+), 16 deletions(-)

-- 
2.9.0

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 1/3] KVM: arm64: vgic-its: Handle errors from vgic_add_lpi
  2016-08-03 16:13 [PATCH 0/3] KVM: arm64: vgic-its fixes Christoffer Dall
@ 2016-08-03 16:13 ` Christoffer Dall
  2016-08-08 11:00   ` Andre Przywara
  2016-08-03 16:13 ` [PATCH 2/3] KVM: arm64: vgic-its: Plug race in vgic_put_irq Christoffer Dall
  2016-08-03 16:13 ` [PATCH 3/3] KVM: arm64: vgic-its: Make updates to propbaser/pendbaser atomic Christoffer Dall
  2 siblings, 1 reply; 13+ messages in thread
From: Christoffer Dall @ 2016-08-03 16:13 UTC (permalink / raw)
  To: linux-arm-kernel

During low memory conditions, we could be dereferencing a NULL pointer
when vgic_add_lpi fails to allocate memory.

Consider for example this call sequence:

  vgic_its_cmd_handle_mapi
      itte->irq = vgic_add_lpi(kvm, lpi_nr);
          update_lpi_config(kvm, itte->irq, NULL);
              ret = kvm_read_guest(kvm, propbase + irq->intid
	                                             ^^^^
						     kaboom?

Instead, return an error pointer from vgic_add_lpi and check the return
value from its single caller.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
---
Changes since v1:
 - Don't errornously get an extra kref refernce for the struct vgic_irq
 - Don't rework the entire error handling of the function, but follow
   what Marc suggested he prefers based on his fixup patch.
 virt/kvm/arm/vgic/vgic-its.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index 07411cf..424f7a5 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -51,7 +51,7 @@ static struct vgic_irq *vgic_add_lpi(struct kvm *kvm, u32 intid)
 
 	irq = kzalloc(sizeof(struct vgic_irq), GFP_KERNEL);
 	if (!irq)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
 
 	INIT_LIST_HEAD(&irq->lpi_list);
 	INIT_LIST_HEAD(&irq->ap_list);
@@ -693,10 +693,11 @@ static int vgic_its_cmd_handle_mapi(struct kvm *kvm, struct vgic_its *its,
 	u32 device_id = its_cmd_get_deviceid(its_cmd);
 	u32 event_id = its_cmd_get_id(its_cmd);
 	u32 coll_id = its_cmd_get_collection(its_cmd);
-	struct its_itte *itte;
+	struct its_itte *itte, *new_itte = NULL;
 	struct its_device *device;
 	struct its_collection *collection, *new_coll = NULL;
 	int lpi_nr;
+	struct vgic_irq *irq;
 
 	device = find_its_device(its, device_id);
 	if (!device)
@@ -720,7 +721,7 @@ static int vgic_its_cmd_handle_mapi(struct kvm *kvm, struct vgic_its *its,
 
 	itte = find_itte(its, device_id, event_id);
 	if (!itte) {
-		itte = kzalloc(sizeof(struct its_itte), GFP_KERNEL);
+		new_itte = itte = kzalloc(sizeof(struct its_itte), GFP_KERNEL);
 		if (!itte) {
 			if (new_coll)
 				vgic_its_free_collection(its, coll_id);
@@ -733,7 +734,16 @@ static int vgic_its_cmd_handle_mapi(struct kvm *kvm, struct vgic_its *its,
 
 	itte->collection = collection;
 	itte->lpi = lpi_nr;
-	itte->irq = vgic_add_lpi(kvm, lpi_nr);
+
+	irq = vgic_add_lpi(kvm, lpi_nr);
+	if (IS_ERR(irq)) {
+		if (new_coll)
+			vgic_its_free_collection(its, coll_id);
+		kfree(new_itte);
+		return PTR_ERR(irq);
+	}
+	itte->irq = irq;
+
 	update_affinity_itte(kvm, itte);
 
 	/*
-- 
2.9.0

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/3] KVM: arm64: vgic-its: Plug race in vgic_put_irq
  2016-08-03 16:13 [PATCH 0/3] KVM: arm64: vgic-its fixes Christoffer Dall
  2016-08-03 16:13 ` [PATCH v2 1/3] KVM: arm64: vgic-its: Handle errors from vgic_add_lpi Christoffer Dall
@ 2016-08-03 16:13 ` Christoffer Dall
  2016-08-08 11:20   ` Andre Przywara
  2016-08-03 16:13 ` [PATCH 3/3] KVM: arm64: vgic-its: Make updates to propbaser/pendbaser atomic Christoffer Dall
  2 siblings, 1 reply; 13+ messages in thread
From: Christoffer Dall @ 2016-08-03 16:13 UTC (permalink / raw)
  To: linux-arm-kernel

Right now the following sequence of events can happen:

  1. Thread X calls vgic_put_irq
  2. Thread Y calls vgic_add_lpi
  3. Thread Y gets lpi_list_lock
  4. Thread X drops the ref count to 0 and blocks on lpi_list_lock
  5. Thread Y finds the irq via the lpi_list_lock, raises the ref
     count to 1, and release the lpi_list_lock.
  6. Thread X proceeds and frees the irq.

Avoid this by holding the spinlock around the kref_put.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
---
 virt/kvm/arm/vgic/vgic.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index e7aeac7..fb8c0ab 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -117,22 +117,22 @@ static void vgic_irq_release(struct kref *ref)
 
 void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
 {
-	struct vgic_dist *dist;
+	struct vgic_dist *dist = &kvm->arch.vgic;
 
 	if (irq->intid < VGIC_MIN_LPI)
 		return;
 
-	if (!kref_put(&irq->refcount, vgic_irq_release))
-		return;
-
-	dist = &kvm->arch.vgic;
-
 	spin_lock(&dist->lpi_list_lock);
-	list_del(&irq->lpi_list);
-	dist->lpi_list_count--;
-	spin_unlock(&dist->lpi_list_lock);
+	if (!kref_put(&irq->refcount, vgic_irq_release)) {
+		spin_unlock(&dist->lpi_list_lock);
+		return;
+	} else {
+		list_del(&irq->lpi_list);
+		dist->lpi_list_count--;
+		spin_unlock(&dist->lpi_list_lock);
 
-	kfree(irq);
+		kfree(irq);
+	}
 }
 
 /**
-- 
2.9.0

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/3] KVM: arm64: vgic-its: Make updates to propbaser/pendbaser atomic
  2016-08-03 16:13 [PATCH 0/3] KVM: arm64: vgic-its fixes Christoffer Dall
  2016-08-03 16:13 ` [PATCH v2 1/3] KVM: arm64: vgic-its: Handle errors from vgic_add_lpi Christoffer Dall
  2016-08-03 16:13 ` [PATCH 2/3] KVM: arm64: vgic-its: Plug race in vgic_put_irq Christoffer Dall
@ 2016-08-03 16:13 ` Christoffer Dall
  2016-08-08 13:30   ` Andre Przywara
  2 siblings, 1 reply; 13+ messages in thread
From: Christoffer Dall @ 2016-08-03 16:13 UTC (permalink / raw)
  To: linux-arm-kernel

There are two problems with the current implementation of the MMIO
handlers for the propbaser and pendbaser:

First, the write to the value itself is not guaranteed to be an atomic
64-bit write so two concurrent writes to the structure field could be
intermixed.

Second, because we do a read-modify-update operation without any
synchronization, if we have two 32-bit accesses to separate parts of the
register, we can loose one of them.

We can take the KVM mutex to synchronize accesses to these registers.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
---
 virt/kvm/arm/vgic/vgic-mmio-v3.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-mmio-v3.c b/virt/kvm/arm/vgic/vgic-mmio-v3.c
index ff668e0..e38b7a0 100644
--- a/virt/kvm/arm/vgic/vgic-mmio-v3.c
+++ b/virt/kvm/arm/vgic/vgic-mmio-v3.c
@@ -306,16 +306,19 @@ static void vgic_mmio_write_propbase(struct kvm_vcpu *vcpu,
 {
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
-	u64 propbaser = dist->propbaser;
+	u64 propbaser;
 
 	/* Storing a value with LPIs already enabled is undefined */
 	if (vgic_cpu->lpis_enabled)
 		return;
 
+	mutex_lock(&vcpu->kvm->lock);
+	propbaser = dist->propbaser;
 	propbaser = update_64bit_reg(propbaser, addr & 4, len, val);
 	propbaser = vgic_sanitise_propbaser(propbaser);
 
 	dist->propbaser = propbaser;
+	mutex_unlock(&vcpu->kvm->lock);
 }
 
 static unsigned long vgic_mmio_read_pendbase(struct kvm_vcpu *vcpu,
@@ -331,16 +334,19 @@ static void vgic_mmio_write_pendbase(struct kvm_vcpu *vcpu,
 				     unsigned long val)
 {
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
-	u64 pendbaser = vgic_cpu->pendbaser;
+	u64 pendbaser;
 
 	/* Storing a value with LPIs already enabled is undefined */
 	if (vgic_cpu->lpis_enabled)
 		return;
 
+	mutex_lock(&vcpu->kvm->lock);
+	pendbaser = vgic_cpu->pendbaser;
 	pendbaser = update_64bit_reg(pendbaser, addr & 4, len, val);
 	pendbaser = vgic_sanitise_pendbaser(pendbaser);
 
 	vgic_cpu->pendbaser = pendbaser;
+	mutex_unlock(&vcpu->kvm->lock);
 }
 
 /*
-- 
2.9.0

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 1/3] KVM: arm64: vgic-its: Handle errors from vgic_add_lpi
  2016-08-03 16:13 ` [PATCH v2 1/3] KVM: arm64: vgic-its: Handle errors from vgic_add_lpi Christoffer Dall
@ 2016-08-08 11:00   ` Andre Przywara
  2016-08-09 10:09     ` Christoffer Dall
  0 siblings, 1 reply; 13+ messages in thread
From: Andre Przywara @ 2016-08-08 11:00 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On 03/08/16 17:13, Christoffer Dall wrote:
> During low memory conditions, we could be dereferencing a NULL pointer
> when vgic_add_lpi fails to allocate memory.
> 
> Consider for example this call sequence:
> 
>   vgic_its_cmd_handle_mapi
>       itte->irq = vgic_add_lpi(kvm, lpi_nr);

Ouch! Thanks for catching this unhandled error return!

>           update_lpi_config(kvm, itte->irq, NULL);
>               ret = kvm_read_guest(kvm, propbase + irq->intid
> 	                                             ^^^^
> 						     kaboom?
> 
> Instead, return an error pointer from vgic_add_lpi and check the return
> value from its single caller.
> 
> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
> ---
> Changes since v1:
>  - Don't errornously get an extra kref refernce for the struct vgic_irq
>  - Don't rework the entire error handling of the function, but follow
>    what Marc suggested he prefers based on his fixup patch.
>  virt/kvm/arm/vgic/vgic-its.c | 18 ++++++++++++++----
>  1 file changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
> index 07411cf..424f7a5 100644
> --- a/virt/kvm/arm/vgic/vgic-its.c
> +++ b/virt/kvm/arm/vgic/vgic-its.c
> @@ -51,7 +51,7 @@ static struct vgic_irq *vgic_add_lpi(struct kvm *kvm, u32 intid)
>  
>  	irq = kzalloc(sizeof(struct vgic_irq), GFP_KERNEL);
>  	if (!irq)
> -		return NULL;
> +		return ERR_PTR(-ENOMEM);
>  
>  	INIT_LIST_HEAD(&irq->lpi_list);
>  	INIT_LIST_HEAD(&irq->ap_list);
> @@ -693,10 +693,11 @@ static int vgic_its_cmd_handle_mapi(struct kvm *kvm, struct vgic_its *its,
>  	u32 device_id = its_cmd_get_deviceid(its_cmd);
>  	u32 event_id = its_cmd_get_id(its_cmd);
>  	u32 coll_id = its_cmd_get_collection(its_cmd);
> -	struct its_itte *itte;
> +	struct its_itte *itte, *new_itte = NULL;
>  	struct its_device *device;
>  	struct its_collection *collection, *new_coll = NULL;
>  	int lpi_nr;
> +	struct vgic_irq *irq;
>  
>  	device = find_its_device(its, device_id);
>  	if (!device)
> @@ -720,7 +721,7 @@ static int vgic_its_cmd_handle_mapi(struct kvm *kvm, struct vgic_its *its,
>  
>  	itte = find_itte(its, device_id, event_id);
>  	if (!itte) {
> -		itte = kzalloc(sizeof(struct its_itte), GFP_KERNEL);
> +		new_itte = itte = kzalloc(sizeof(struct its_itte), GFP_KERNEL);

Nit: Aren't double assignments frowned upon in the kernel?

>  		if (!itte) {
>  			if (new_coll)
>  				vgic_its_free_collection(its, coll_id);
> @@ -733,7 +734,16 @@ static int vgic_its_cmd_handle_mapi(struct kvm *kvm, struct vgic_its *its,
>  
>  	itte->collection = collection;
>  	itte->lpi = lpi_nr;
> -	itte->irq = vgic_add_lpi(kvm, lpi_nr);
> +
> +	irq = vgic_add_lpi(kvm, lpi_nr);
> +	if (IS_ERR(irq)) {
> +		if (new_coll)
> +			vgic_its_free_collection(its, coll_id);
> +		kfree(new_itte);

But at this point we already have added that ITTE to the
device->itt_head, haven't we?
Since we hold the its_lock, would a simple:

		if (new_itte) {
			list_del(&itte->itte_list);
			kfree(new_itte);
		}

suffice to fix this?

> +		return PTR_ERR(irq);
> +	}
> +	itte->irq = irq;
> +
>  	update_affinity_itte(kvm, itte);
>  
>  	/*
> 

Cheers,
Andre.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 2/3] KVM: arm64: vgic-its: Plug race in vgic_put_irq
  2016-08-03 16:13 ` [PATCH 2/3] KVM: arm64: vgic-its: Plug race in vgic_put_irq Christoffer Dall
@ 2016-08-08 11:20   ` Andre Przywara
  2016-08-09 10:20     ` Christoffer Dall
  0 siblings, 1 reply; 13+ messages in thread
From: Andre Przywara @ 2016-08-08 11:20 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On 03/08/16 17:13, Christoffer Dall wrote:
> Right now the following sequence of events can happen:
> 
>   1. Thread X calls vgic_put_irq
>   2. Thread Y calls vgic_add_lpi
>   3. Thread Y gets lpi_list_lock
>   4. Thread X drops the ref count to 0 and blocks on lpi_list_lock
>   5. Thread Y finds the irq via the lpi_list_lock, raises the ref
>      count to 1, and release the lpi_list_lock.
>   6. Thread X proceeds and frees the irq.
> 
> Avoid this by holding the spinlock around the kref_put.
> 
> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
> ---
>  virt/kvm/arm/vgic/vgic.c | 20 ++++++++++----------
>  1 file changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> index e7aeac7..fb8c0ab 100644
> --- a/virt/kvm/arm/vgic/vgic.c
> +++ b/virt/kvm/arm/vgic/vgic.c
> @@ -117,22 +117,22 @@ static void vgic_irq_release(struct kref *ref)
>  
>  void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
>  {
> -	struct vgic_dist *dist;
> +	struct vgic_dist *dist = &kvm->arch.vgic;
>  
>  	if (irq->intid < VGIC_MIN_LPI)
>  		return;
>  
> -	if (!kref_put(&irq->refcount, vgic_irq_release))
> -		return;
> -
> -	dist = &kvm->arch.vgic;
> -
>  	spin_lock(&dist->lpi_list_lock);
> -	list_del(&irq->lpi_list);
> -	dist->lpi_list_count--;
> -	spin_unlock(&dist->lpi_list_lock);
> +	if (!kref_put(&irq->refcount, vgic_irq_release)) {
> +		spin_unlock(&dist->lpi_list_lock);
> +		return;
> +	} else {

Just a nit, I guess, but we don't need this "else" since the if-branch
always returns?

> +		list_del(&irq->lpi_list);
> +		dist->lpi_list_count--;
> +		spin_unlock(&dist->lpi_list_lock);
>  
> -	kfree(irq);
> +		kfree(irq);
> +	}
>  }
>  
>  /**
> 

Otherwise:

Reviewed-by: Andre Przywara <andre.przywara@arm.com>

Cheers,
Andre.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 3/3] KVM: arm64: vgic-its: Make updates to propbaser/pendbaser atomic
  2016-08-03 16:13 ` [PATCH 3/3] KVM: arm64: vgic-its: Make updates to propbaser/pendbaser atomic Christoffer Dall
@ 2016-08-08 13:30   ` Andre Przywara
  2016-08-09 10:30     ` Christoffer Dall
  0 siblings, 1 reply; 13+ messages in thread
From: Andre Przywara @ 2016-08-08 13:30 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On 03/08/16 17:13, Christoffer Dall wrote:
> There are two problems with the current implementation of the MMIO
> handlers for the propbaser and pendbaser:
> 
> First, the write to the value itself is not guaranteed to be an atomic
> 64-bit write so two concurrent writes to the structure field could be
> intermixed.
> 
> Second, because we do a read-modify-update operation without any
> synchronization, if we have two 32-bit accesses to separate parts of the
> register, we can loose one of them.

I am still not 100% convinced that this is necessary, but leave it up to
the judgement of you senior guys.

> We can take the KVM mutex to synchronize accesses to these registers.
> 
> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
> ---
>  virt/kvm/arm/vgic/vgic-mmio-v3.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/virt/kvm/arm/vgic/vgic-mmio-v3.c b/virt/kvm/arm/vgic/vgic-mmio-v3.c
> index ff668e0..e38b7a0 100644
> --- a/virt/kvm/arm/vgic/vgic-mmio-v3.c
> +++ b/virt/kvm/arm/vgic/vgic-mmio-v3.c
> @@ -306,16 +306,19 @@ static void vgic_mmio_write_propbase(struct kvm_vcpu *vcpu,
>  {
>  	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> -	u64 propbaser = dist->propbaser;
> +	u64 propbaser;
>  
>  	/* Storing a value with LPIs already enabled is undefined */
>  	if (vgic_cpu->lpis_enabled)
>  		return;
>  
> +	mutex_lock(&vcpu->kvm->lock);

I see that kvm->lock is becoming problematic in the future, since the
userland save/restore path for GICv2 is taking this lock as well. So we
have to come up with something better once we use migration on
GICv3/ITS. I have the gut feeling we need an extra lock for those two
registers.
But this is not an issue for the purpose of this fix in the current code
base.

Do we need to add the kvm->lock to our locking order documentation?

> +	propbaser = dist->propbaser;
>  	propbaser = update_64bit_reg(propbaser, addr & 4, len, val);
>  	propbaser = vgic_sanitise_propbaser(propbaser);
>  
>  	dist->propbaser = propbaser;
> +	mutex_unlock(&vcpu->kvm->lock);
>  }
>  
>  static unsigned long vgic_mmio_read_pendbase(struct kvm_vcpu *vcpu,
> @@ -331,16 +334,19 @@ static void vgic_mmio_write_pendbase(struct kvm_vcpu *vcpu,
>  				     unsigned long val)
>  {
>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> -	u64 pendbaser = vgic_cpu->pendbaser;
> +	u64 pendbaser;
>  
>  	/* Storing a value with LPIs already enabled is undefined */
>  	if (vgic_cpu->lpis_enabled)
>  		return;
>  
> +	mutex_lock(&vcpu->kvm->lock);
> +	pendbaser = vgic_cpu->pendbaser;
>  	pendbaser = update_64bit_reg(pendbaser, addr & 4, len, val);
>  	pendbaser = vgic_sanitise_pendbaser(pendbaser);
>  
>  	vgic_cpu->pendbaser = pendbaser;
> +	mutex_unlock(&vcpu->kvm->lock);
>  }
>  
>  /*
> 

I checked all hits of 'git grep "kvm->lock"' (minus
arch/<some_obscure_arch>), and apart from that above mentioned GICv2
save/restore path couldn't find anything that looks prone to deadlocks
(on the first glance).
Also I enabled some locking checks in .config and am running three SMP
guests with ITS emulation simultaneously at the moment: the kernel
didn't complain so far.

So this looks like a safe change to me.

Reviewed-by: Andre Przywara <andre.przywara@arm.com>

Cheers,
Andre.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 1/3] KVM: arm64: vgic-its: Handle errors from vgic_add_lpi
  2016-08-08 11:00   ` Andre Przywara
@ 2016-08-09 10:09     ` Christoffer Dall
  0 siblings, 0 replies; 13+ messages in thread
From: Christoffer Dall @ 2016-08-09 10:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Aug 08, 2016 at 12:00:50PM +0100, Andre Przywara wrote:
> Hi,
> 
> On 03/08/16 17:13, Christoffer Dall wrote:
> > During low memory conditions, we could be dereferencing a NULL pointer
> > when vgic_add_lpi fails to allocate memory.
> > 
> > Consider for example this call sequence:
> > 
> >   vgic_its_cmd_handle_mapi
> >       itte->irq = vgic_add_lpi(kvm, lpi_nr);
> 
> Ouch! Thanks for catching this unhandled error return!
> 
> >           update_lpi_config(kvm, itte->irq, NULL);
> >               ret = kvm_read_guest(kvm, propbase + irq->intid
> > 	                                             ^^^^
> > 						     kaboom?
> > 
> > Instead, return an error pointer from vgic_add_lpi and check the return
> > value from its single caller.
> > 
> > Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
> > ---
> > Changes since v1:
> >  - Don't errornously get an extra kref refernce for the struct vgic_irq
> >  - Don't rework the entire error handling of the function, but follow
> >    what Marc suggested he prefers based on his fixup patch.
> >  virt/kvm/arm/vgic/vgic-its.c | 18 ++++++++++++++----
> >  1 file changed, 14 insertions(+), 4 deletions(-)
> > 
> > diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
> > index 07411cf..424f7a5 100644
> > --- a/virt/kvm/arm/vgic/vgic-its.c
> > +++ b/virt/kvm/arm/vgic/vgic-its.c
> > @@ -51,7 +51,7 @@ static struct vgic_irq *vgic_add_lpi(struct kvm *kvm, u32 intid)
> >  
> >  	irq = kzalloc(sizeof(struct vgic_irq), GFP_KERNEL);
> >  	if (!irq)
> > -		return NULL;
> > +		return ERR_PTR(-ENOMEM);
> >  
> >  	INIT_LIST_HEAD(&irq->lpi_list);
> >  	INIT_LIST_HEAD(&irq->ap_list);
> > @@ -693,10 +693,11 @@ static int vgic_its_cmd_handle_mapi(struct kvm *kvm, struct vgic_its *its,
> >  	u32 device_id = its_cmd_get_deviceid(its_cmd);
> >  	u32 event_id = its_cmd_get_id(its_cmd);
> >  	u32 coll_id = its_cmd_get_collection(its_cmd);
> > -	struct its_itte *itte;
> > +	struct its_itte *itte, *new_itte = NULL;
> >  	struct its_device *device;
> >  	struct its_collection *collection, *new_coll = NULL;
> >  	int lpi_nr;
> > +	struct vgic_irq *irq;
> >  
> >  	device = find_its_device(its, device_id);
> >  	if (!device)
> > @@ -720,7 +721,7 @@ static int vgic_its_cmd_handle_mapi(struct kvm *kvm, struct vgic_its *its,
> >  
> >  	itte = find_itte(its, device_id, event_id);
> >  	if (!itte) {
> > -		itte = kzalloc(sizeof(struct its_itte), GFP_KERNEL);
> > +		new_itte = itte = kzalloc(sizeof(struct its_itte), GFP_KERNEL);
> 
> Nit: Aren't double assignments frowned upon in the kernel?
> 

Seems like it is accoding to CodingStyle, although it can be found
numerous places in the code base.  But you're right, let's follow the
official style.

> >  		if (!itte) {
> >  			if (new_coll)
> >  				vgic_its_free_collection(its, coll_id);
> > @@ -733,7 +734,16 @@ static int vgic_its_cmd_handle_mapi(struct kvm *kvm, struct vgic_its *its,
> >  
> >  	itte->collection = collection;
> >  	itte->lpi = lpi_nr;
> > -	itte->irq = vgic_add_lpi(kvm, lpi_nr);
> > +
> > +	irq = vgic_add_lpi(kvm, lpi_nr);
> > +	if (IS_ERR(irq)) {
> > +		if (new_coll)
> > +			vgic_its_free_collection(its, coll_id);
> > +		kfree(new_itte);
> 
> But at this point we already have added that ITTE to the
> device->itt_head, haven't we?
> Since we hold the its_lock, would a simple:
> 
> 		if (new_itte) {
> 			list_del(&itte->itte_list);
> 			kfree(new_itte);
> 		}
> 
> suffice to fix this?
> 

hmm, it would be good to call its_free_itte for this.  But then that
would put a reference on an IRQ, which wouldn't necessarily have been
taken.  That could be reworked by changing its_free_itte like this:

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index 424f7a5..6342c92 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -502,7 +502,8 @@ static void its_free_itte(struct kvm *kvm, struct its_itte *itte)
 	list_del(&itte->itte_list);
 
 	/* This put matches the get in vgic_add_lpi. */
-	vgic_put_irq(kvm, itte->irq);
+	if (iite->irq)
+		vgic_put_irq(kvm, itte->irq);
 
 	kfree(itte);
 }

But this makes me wonder how we're really dealing with reference counts
in the case where you find an itte and don't need to allocate one.
Would this BUG_ON ever fire?:

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index 424f7a5..a33fbf1 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -730,6 +730,8 @@ static int vgic_its_cmd_handle_mapi(struct kvm *kvm, struct vgic_its *its,
 
 		itte->event_id	= event_id;
 		list_add_tail(&itte->itte_list, &device->itt_head);
+	} else {
+		BUG_ON(itte->irq);
 	}
 
 	itte->collection = collection;

If not, I don't understand how you can just assign the irq field on
the itte without putting whatever IRQ there may already be held with a
reference there.

Can you explain me the flow of how an itte is allocated, but not
assigned an IRQ, and then later found in vgic_its_cmd_handle_mapi?

> > +		return PTR_ERR(irq);
> > +	}
> > +	itte->irq = irq;
> > +
> >  	update_affinity_itte(kvm, itte);
> >  
> >  	/*
> > 
> 
Thanks,
-Christoffer

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/3] KVM: arm64: vgic-its: Plug race in vgic_put_irq
  2016-08-08 11:20   ` Andre Przywara
@ 2016-08-09 10:20     ` Christoffer Dall
  0 siblings, 0 replies; 13+ messages in thread
From: Christoffer Dall @ 2016-08-09 10:20 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Aug 08, 2016 at 12:20:14PM +0100, Andre Przywara wrote:
> Hi,
> 
> On 03/08/16 17:13, Christoffer Dall wrote:
> > Right now the following sequence of events can happen:
> > 
> >   1. Thread X calls vgic_put_irq
> >   2. Thread Y calls vgic_add_lpi
> >   3. Thread Y gets lpi_list_lock
> >   4. Thread X drops the ref count to 0 and blocks on lpi_list_lock
> >   5. Thread Y finds the irq via the lpi_list_lock, raises the ref
> >      count to 1, and release the lpi_list_lock.
> >   6. Thread X proceeds and frees the irq.
> > 
> > Avoid this by holding the spinlock around the kref_put.
> > 
> > Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
> > ---
> >  virt/kvm/arm/vgic/vgic.c | 20 ++++++++++----------
> >  1 file changed, 10 insertions(+), 10 deletions(-)
> > 
> > diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> > index e7aeac7..fb8c0ab 100644
> > --- a/virt/kvm/arm/vgic/vgic.c
> > +++ b/virt/kvm/arm/vgic/vgic.c
> > @@ -117,22 +117,22 @@ static void vgic_irq_release(struct kref *ref)
> >  
> >  void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
> >  {
> > -	struct vgic_dist *dist;
> > +	struct vgic_dist *dist = &kvm->arch.vgic;
> >  
> >  	if (irq->intid < VGIC_MIN_LPI)
> >  		return;
> >  
> > -	if (!kref_put(&irq->refcount, vgic_irq_release))
> > -		return;
> > -
> > -	dist = &kvm->arch.vgic;
> > -
> >  	spin_lock(&dist->lpi_list_lock);
> > -	list_del(&irq->lpi_list);
> > -	dist->lpi_list_count--;
> > -	spin_unlock(&dist->lpi_list_lock);
> > +	if (!kref_put(&irq->refcount, vgic_irq_release)) {
> > +		spin_unlock(&dist->lpi_list_lock);
> > +		return;
> > +	} else {
> 
> Just a nit, I guess, but we don't need this "else" since the if-branch
> always returns?
> 

correct, I can change that.

> > +		list_del(&irq->lpi_list);
> > +		dist->lpi_list_count--;
> > +		spin_unlock(&dist->lpi_list_lock);
> >  
> > -	kfree(irq);
> > +		kfree(irq);
> > +	}
> >  }
> >  
> >  /**
> > 
> 
> Otherwise:
> 
> Reviewed-by: Andre Przywara <andre.przywara@arm.com>
> 
Thanks.
-Christoffer

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 3/3] KVM: arm64: vgic-its: Make updates to propbaser/pendbaser atomic
  2016-08-08 13:30   ` Andre Przywara
@ 2016-08-09 10:30     ` Christoffer Dall
  2016-08-09 10:43       ` Christoffer Dall
  2016-08-09 11:19       ` Andre Przywara
  0 siblings, 2 replies; 13+ messages in thread
From: Christoffer Dall @ 2016-08-09 10:30 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Aug 08, 2016 at 02:30:38PM +0100, Andre Przywara wrote:
> Hi,
> 
> On 03/08/16 17:13, Christoffer Dall wrote:
> > There are two problems with the current implementation of the MMIO
> > handlers for the propbaser and pendbaser:
> > 
> > First, the write to the value itself is not guaranteed to be an atomic
> > 64-bit write so two concurrent writes to the structure field could be
> > intermixed.
> > 
> > Second, because we do a read-modify-update operation without any
> > synchronization, if we have two 32-bit accesses to separate parts of the
> > register, we can loose one of them.
> 
> I am still not 100% convinced that this is necessary, but leave it up to
> the judgement of you senior guys.

ok, consider this case:

    reg = 0x55555555 55555555;

    CPU 0                CPU 1
    -----                -----
    tmp = reg;
                         tmp = reg;
    tmp[63:32] = ~0;
                         tmp[31:0] = 0;
    reg = tmp;
                         reg = tmp;

    print("reg is %x", reg);
      /* reg is 0x55555555 00000000 */

which is weird, because I think in hardware you'll get:
    0xffffffff 00000000

no matter how you order the two 32-bit updates.

That is, unless the architecture tells us that you could observe the
above behavior.


> 
> > We can take the KVM mutex to synchronize accesses to these registers.
> > 
> > Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
> > ---
> >  virt/kvm/arm/vgic/vgic-mmio-v3.c | 10 ++++++++--
> >  1 file changed, 8 insertions(+), 2 deletions(-)
> > 
> > diff --git a/virt/kvm/arm/vgic/vgic-mmio-v3.c b/virt/kvm/arm/vgic/vgic-mmio-v3.c
> > index ff668e0..e38b7a0 100644
> > --- a/virt/kvm/arm/vgic/vgic-mmio-v3.c
> > +++ b/virt/kvm/arm/vgic/vgic-mmio-v3.c
> > @@ -306,16 +306,19 @@ static void vgic_mmio_write_propbase(struct kvm_vcpu *vcpu,
> >  {
> >  	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> >  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> > -	u64 propbaser = dist->propbaser;
> > +	u64 propbaser;
> >  
> >  	/* Storing a value with LPIs already enabled is undefined */
> >  	if (vgic_cpu->lpis_enabled)
> >  		return;
> >  
> > +	mutex_lock(&vcpu->kvm->lock);
> 
> I see that kvm->lock is becoming problematic in the future, since the
> userland save/restore path for GICv2 is taking this lock as well. So we
> have to come up with something better once we use migration on
> GICv3/ITS. I have the gut feeling we need an extra lock for those two
> registers.

that's why I started with a distributor lock, but you talked my out of
it on IRC.  I'll just change this patch to introduce the distributor
lock.  It's ok to have that as long as we don't grab it all over, which
we won't.

> But this is not an issue for the purpose of this fix in the current code
> base.
> 
> Do we need to add the kvm->lock to our locking order documentation?
> 

I'll think about this.

> > +	propbaser = dist->propbaser;
> >  	propbaser = update_64bit_reg(propbaser, addr & 4, len, val);
> >  	propbaser = vgic_sanitise_propbaser(propbaser);
> >  
> >  	dist->propbaser = propbaser;
> > +	mutex_unlock(&vcpu->kvm->lock);
> >  }
> >  
> >  static unsigned long vgic_mmio_read_pendbase(struct kvm_vcpu *vcpu,
> > @@ -331,16 +334,19 @@ static void vgic_mmio_write_pendbase(struct kvm_vcpu *vcpu,
> >  				     unsigned long val)
> >  {
> >  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> > -	u64 pendbaser = vgic_cpu->pendbaser;
> > +	u64 pendbaser;
> >  
> >  	/* Storing a value with LPIs already enabled is undefined */
> >  	if (vgic_cpu->lpis_enabled)
> >  		return;
> >  
> > +	mutex_lock(&vcpu->kvm->lock);
> > +	pendbaser = vgic_cpu->pendbaser;
> >  	pendbaser = update_64bit_reg(pendbaser, addr & 4, len, val);
> >  	pendbaser = vgic_sanitise_pendbaser(pendbaser);
> >  
> >  	vgic_cpu->pendbaser = pendbaser;
> > +	mutex_unlock(&vcpu->kvm->lock);
> >  }
> >  
> >  /*
> > 
> 
> I checked all hits of 'git grep "kvm->lock"' (minus
> arch/<some_obscure_arch>), and apart from that above mentioned GICv2
> save/restore path couldn't find anything that looks prone to deadlocks
> (on the first glance).
> Also I enabled some locking checks in .config and am running three SMP
> guests with ITS emulation simultaneously at the moment: the kernel
> didn't complain so far.
> 
> So this looks like a safe change to me.
> 
> Reviewed-by: Andre Przywara <andre.przywara@arm.com>

Thanks for doing that.  It should be trivial to verify that this works
with a dedicated distributor lock as well though.

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 3/3] KVM: arm64: vgic-its: Make updates to propbaser/pendbaser atomic
  2016-08-09 10:30     ` Christoffer Dall
@ 2016-08-09 10:43       ` Christoffer Dall
  2016-08-09 11:19       ` Andre Przywara
  1 sibling, 0 replies; 13+ messages in thread
From: Christoffer Dall @ 2016-08-09 10:43 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 09, 2016 at 12:30:12PM +0200, Christoffer Dall wrote:
> On Mon, Aug 08, 2016 at 02:30:38PM +0100, Andre Przywara wrote:
> > Hi,
> > 
> > On 03/08/16 17:13, Christoffer Dall wrote:
> > > There are two problems with the current implementation of the MMIO
> > > handlers for the propbaser and pendbaser:
> > > 
> > > First, the write to the value itself is not guaranteed to be an atomic
> > > 64-bit write so two concurrent writes to the structure field could be
> > > intermixed.
> > > 
> > > Second, because we do a read-modify-update operation without any
> > > synchronization, if we have two 32-bit accesses to separate parts of the
> > > register, we can loose one of them.
> > 
> > I am still not 100% convinced that this is necessary, but leave it up to
> > the judgement of you senior guys.
> 
> ok, consider this case:
> 
>     reg = 0x55555555 55555555;
> 
>     CPU 0                CPU 1
>     -----                -----
>     tmp = reg;
>                          tmp = reg;
>     tmp[63:32] = ~0;
>                          tmp[31:0] = 0;
>     reg = tmp;
>                          reg = tmp;
> 
>     print("reg is %x", reg);
>       /* reg is 0x55555555 00000000 */
> 
> which is weird, because I think in hardware you'll get:
>     0xffffffff 00000000
> 
> no matter how you order the two 32-bit updates.
> 
> That is, unless the architecture tells us that you could observe the
> above behavior.
> 
> 
> > 
> > > We can take the KVM mutex to synchronize accesses to these registers.
> > > 
> > > Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
> > > ---
> > >  virt/kvm/arm/vgic/vgic-mmio-v3.c | 10 ++++++++--
> > >  1 file changed, 8 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/virt/kvm/arm/vgic/vgic-mmio-v3.c b/virt/kvm/arm/vgic/vgic-mmio-v3.c
> > > index ff668e0..e38b7a0 100644
> > > --- a/virt/kvm/arm/vgic/vgic-mmio-v3.c
> > > +++ b/virt/kvm/arm/vgic/vgic-mmio-v3.c
> > > @@ -306,16 +306,19 @@ static void vgic_mmio_write_propbase(struct kvm_vcpu *vcpu,
> > >  {
> > >  	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> > >  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> > > -	u64 propbaser = dist->propbaser;
> > > +	u64 propbaser;
> > >  
> > >  	/* Storing a value with LPIs already enabled is undefined */
> > >  	if (vgic_cpu->lpis_enabled)
> > >  		return;
> > >  
> > > +	mutex_lock(&vcpu->kvm->lock);
> > 
> > I see that kvm->lock is becoming problematic in the future, since the
> > userland save/restore path for GICv2 is taking this lock as well. So we
> > have to come up with something better once we use migration on
> > GICv3/ITS. I have the gut feeling we need an extra lock for those two
> > registers.
> 
> that's why I started with a distributor lock, but you talked my out of
> it on IRC.  I'll just change this patch to introduce the distributor
> lock.  It's ok to have that as long as we don't grab it all over, which
> we won't.
> 
> > But this is not an issue for the purpose of this fix in the current code
> > base.
> > 
> > Do we need to add the kvm->lock to our locking order documentation?
> > 
> 
> I'll think about this.
> 
So I think Marc had the better intuition here, and by just using a
cmpxchg64 we can get around introducing more locks etc. so I took at
stab at this.

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 3/3] KVM: arm64: vgic-its: Make updates to propbaser/pendbaser atomic
  2016-08-09 10:30     ` Christoffer Dall
  2016-08-09 10:43       ` Christoffer Dall
@ 2016-08-09 11:19       ` Andre Przywara
  2016-08-09 11:56         ` Christoffer Dall
  1 sibling, 1 reply; 13+ messages in thread
From: Andre Przywara @ 2016-08-09 11:19 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On 09/08/16 11:30, Christoffer Dall wrote:
> On Mon, Aug 08, 2016 at 02:30:38PM +0100, Andre Przywara wrote:
>> Hi,
>>
>> On 03/08/16 17:13, Christoffer Dall wrote:
>>> There are two problems with the current implementation of the MMIO
>>> handlers for the propbaser and pendbaser:
>>>
>>> First, the write to the value itself is not guaranteed to be an atomic
>>> 64-bit write so two concurrent writes to the structure field could be
>>> intermixed.
>>>
>>> Second, because we do a read-modify-update operation without any
>>> synchronization, if we have two 32-bit accesses to separate parts of the
>>> register, we can loose one of them.
>>
>> I am still not 100% convinced that this is necessary, but leave it up to
>> the judgement of you senior guys.
> 
> ok, consider this case:
> 
>     reg = 0x55555555 55555555;
> 
>     CPU 0                CPU 1
>     -----                -----
>     tmp = reg;
>                          tmp = reg;
>     tmp[63:32] = ~0;
>                          tmp[31:0] = 0;
>     reg = tmp;
>                          reg = tmp;
> 
>     print("reg is %x", reg);
>       /* reg is 0x55555555 00000000 */
> 
> which is weird, because I think in hardware you'll get:
>     0xffffffff 00000000
> 
> no matter how you order the two 32-bit updates.
> 
> That is, unless the architecture tells us that you could observe the
> above behavior.

OK, I can see that this case is indeed broken. I was just wondering how
much software can expect if it allows unsynchronized accesses from
different CPUs to such a 64-bit register - even if it is for separate
halves of that register. I'd expect (guest) software to take a lock if
it can't atomically update prop/pendbaser and there are other agents
possibly chiming in.
And I hope we sanitize them enough now to avoid any bad things to happen
in the kernel ;-)

>>
>>> We can take the KVM mutex to synchronize accesses to these registers.
>>>
>>> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
>>> ---
>>>  virt/kvm/arm/vgic/vgic-mmio-v3.c | 10 ++++++++--
>>>  1 file changed, 8 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/virt/kvm/arm/vgic/vgic-mmio-v3.c b/virt/kvm/arm/vgic/vgic-mmio-v3.c
>>> index ff668e0..e38b7a0 100644
>>> --- a/virt/kvm/arm/vgic/vgic-mmio-v3.c
>>> +++ b/virt/kvm/arm/vgic/vgic-mmio-v3.c
>>> @@ -306,16 +306,19 @@ static void vgic_mmio_write_propbase(struct kvm_vcpu *vcpu,
>>>  {
>>>  	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>>>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>>> -	u64 propbaser = dist->propbaser;
>>> +	u64 propbaser;
>>>  
>>>  	/* Storing a value with LPIs already enabled is undefined */
>>>  	if (vgic_cpu->lpis_enabled)
>>>  		return;
>>>  
>>> +	mutex_lock(&vcpu->kvm->lock);
>>
>> I see that kvm->lock is becoming problematic in the future, since the
>> userland save/restore path for GICv2 is taking this lock as well. So we
>> have to come up with something better once we use migration on
>> GICv3/ITS. I have the gut feeling we need an extra lock for those two
>> registers.
> 
> that's why I started with a distributor lock, but you talked my out of
> it on IRC.  I'll just change this patch to introduce the distributor
> lock.  It's ok to have that as long as we don't grab it all over, which
> we won't.
> 
>> But this is not an issue for the purpose of this fix in the current code
>> base.
>>
>> Do we need to add the kvm->lock to our locking order documentation?
>>
> 
> I'll think about this.
> 
>>> +	propbaser = dist->propbaser;
>>>  	propbaser = update_64bit_reg(propbaser, addr & 4, len, val);
>>>  	propbaser = vgic_sanitise_propbaser(propbaser);
>>>  
>>>  	dist->propbaser = propbaser;
>>> +	mutex_unlock(&vcpu->kvm->lock);
>>>  }
>>>  
>>>  static unsigned long vgic_mmio_read_pendbase(struct kvm_vcpu *vcpu,
>>> @@ -331,16 +334,19 @@ static void vgic_mmio_write_pendbase(struct kvm_vcpu *vcpu,
>>>  				     unsigned long val)
>>>  {
>>>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>>> -	u64 pendbaser = vgic_cpu->pendbaser;
>>> +	u64 pendbaser;
>>>  
>>>  	/* Storing a value with LPIs already enabled is undefined */
>>>  	if (vgic_cpu->lpis_enabled)
>>>  		return;
>>>  
>>> +	mutex_lock(&vcpu->kvm->lock);
>>> +	pendbaser = vgic_cpu->pendbaser;
>>>  	pendbaser = update_64bit_reg(pendbaser, addr & 4, len, val);
>>>  	pendbaser = vgic_sanitise_pendbaser(pendbaser);
>>>  
>>>  	vgic_cpu->pendbaser = pendbaser;
>>> +	mutex_unlock(&vcpu->kvm->lock);
>>>  }
>>>  
>>>  /*
>>>
>>
>> I checked all hits of 'git grep "kvm->lock"' (minus
>> arch/<some_obscure_arch>), and apart from that above mentioned GICv2
>> save/restore path couldn't find anything that looks prone to deadlocks
>> (on the first glance).
>> Also I enabled some locking checks in .config and am running three SMP
>> guests with ITS emulation simultaneously at the moment: the kernel
>> didn't complain so far.
>>
>> So this looks like a safe change to me.
>>
>> Reviewed-by: Andre Przywara <andre.przywara@arm.com>
> 
> Thanks for doing that.  It should be trivial to verify that this works
> with a dedicated distributor lock as well though.

Indeed it should be much easier. I think those two registers are special
because they affect the whole system (not just one ITS) and are not
interrupt specific as most of the distributor registers. And they are
the only writeable redistributor registers we actually implement.
That's why I was reluctant to re-introduce a BKL style dist lock for
just those two.

Looking forward to your cmpxchg solution ;-)

Cheers,
Andre.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 3/3] KVM: arm64: vgic-its: Make updates to propbaser/pendbaser atomic
  2016-08-09 11:19       ` Andre Przywara
@ 2016-08-09 11:56         ` Christoffer Dall
  0 siblings, 0 replies; 13+ messages in thread
From: Christoffer Dall @ 2016-08-09 11:56 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 09, 2016 at 12:19:53PM +0100, Andre Przywara wrote:
> Hi,
> 
> On 09/08/16 11:30, Christoffer Dall wrote:
> > On Mon, Aug 08, 2016 at 02:30:38PM +0100, Andre Przywara wrote:
> >> Hi,
> >>
> >> On 03/08/16 17:13, Christoffer Dall wrote:
> >>> There are two problems with the current implementation of the MMIO
> >>> handlers for the propbaser and pendbaser:
> >>>
> >>> First, the write to the value itself is not guaranteed to be an atomic
> >>> 64-bit write so two concurrent writes to the structure field could be
> >>> intermixed.
> >>>
> >>> Second, because we do a read-modify-update operation without any
> >>> synchronization, if we have two 32-bit accesses to separate parts of the
> >>> register, we can loose one of them.
> >>
> >> I am still not 100% convinced that this is necessary, but leave it up to
> >> the judgement of you senior guys.
> > 
> > ok, consider this case:
> > 
> >     reg = 0x55555555 55555555;
> > 
> >     CPU 0                CPU 1
> >     -----                -----
> >     tmp = reg;
> >                          tmp = reg;
> >     tmp[63:32] = ~0;
> >                          tmp[31:0] = 0;
> >     reg = tmp;
> >                          reg = tmp;
> > 
> >     print("reg is %x", reg);
> >       /* reg is 0x55555555 00000000 */
> > 
> > which is weird, because I think in hardware you'll get:
> >     0xffffffff 00000000
> > 
> > no matter how you order the two 32-bit updates.
> > 
> > That is, unless the architecture tells us that you could observe the
> > above behavior.
> 
> OK, I can see that this case is indeed broken. I was just wondering how
> much software can expect if it allows unsynchronized accesses from
> different CPUs to such a 64-bit register - even if it is for separate
> halves of that register. I'd expect (guest) software to take a lock if
> it can't atomically update prop/pendbaser and there are other agents
> possibly chiming in.
> And I hope we sanitize them enough now to avoid any bad things to happen
> in the kernel ;-)
> 
I don't think it's 100% unreasonable to potentially imagine independent
updates of parts of the register when the spec explicitly says this is
allowed.

I agree, that it would be a curious guest (and I raised this point once
already), but I think relying on this is a terrible approach to
emulating hardware and without any comment or anything in the kernel
saying 'this is safe and reasonable because of so and so' it just looks
too dodgy for me to live with.

-Christoffer

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2016-08-09 11:56 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-08-03 16:13 [PATCH 0/3] KVM: arm64: vgic-its fixes Christoffer Dall
2016-08-03 16:13 ` [PATCH v2 1/3] KVM: arm64: vgic-its: Handle errors from vgic_add_lpi Christoffer Dall
2016-08-08 11:00   ` Andre Przywara
2016-08-09 10:09     ` Christoffer Dall
2016-08-03 16:13 ` [PATCH 2/3] KVM: arm64: vgic-its: Plug race in vgic_put_irq Christoffer Dall
2016-08-08 11:20   ` Andre Przywara
2016-08-09 10:20     ` Christoffer Dall
2016-08-03 16:13 ` [PATCH 3/3] KVM: arm64: vgic-its: Make updates to propbaser/pendbaser atomic Christoffer Dall
2016-08-08 13:30   ` Andre Przywara
2016-08-09 10:30     ` Christoffer Dall
2016-08-09 10:43       ` Christoffer Dall
2016-08-09 11:19       ` Andre Przywara
2016-08-09 11:56         ` Christoffer Dall

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).