* [PATCH] KVM: avoid taking ioapic mutex for non-ioapic EOIs
@ 2009-12-28 12:08 Avi Kivity
2009-12-28 20:37 ` Marcelo Tosatti
0 siblings, 1 reply; 6+ messages in thread
From: Avi Kivity @ 2009-12-28 12:08 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: Gleb Natapov, kvm
When the guest acknowledges an interrupt, it sends an EOI message to the local
apic, which broadcasts it to the ioapic. To handle the EOI, we need to take
the ioapic mutex.
On large guests, this causes a lot of contention on this mutex. Since large
guests usually don't route interrupts via the ioapic (they use msi instead),
this is completely unnecessary.
Avoid taking the mutex by introducing a handled_vectors bitmap. Before taking
the mutex, check if the ioapic was actually responsible for the acked vector.
If not, we can return early.
Signed-off-by: Avi Kivity <avi@redhat.com>
---
virt/kvm/ioapic.c | 19 +++++++++++++++++++
virt/kvm/ioapic.h | 1 +
2 files changed, 20 insertions(+), 0 deletions(-)
diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
index f01392f..a2edfd1 100644
--- a/virt/kvm/ioapic.c
+++ b/virt/kvm/ioapic.c
@@ -100,6 +100,19 @@ static int ioapic_service(struct kvm_ioapic *ioapic, unsigned int idx)
return injected;
}
+static void update_handled_vectors(struct kvm_ioapic *ioapic)
+{
+ DECLARE_BITMAP(handled_vectors, 256);
+ int i;
+
+ memset(handled_vectors, 0, sizeof(handled_vectors));
+ for (i = 0; i < IOAPIC_NUM_PINS; ++i)
+ __set_bit(ioapic->redirtbl[i].fields.vector, handled_vectors);
+ memcpy(ioapic->handled_vectors, handled_vectors,
+ sizeof(handled_vectors));
+ smp_wmb();
+}
+
static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val)
{
unsigned index;
@@ -134,6 +147,7 @@ static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val)
e->bits |= (u32) val;
e->fields.remote_irr = 0;
}
+ update_handled_vectors(ioapic);
mask_after = e->fields.mask;
if (mask_before != mask_after)
kvm_fire_mask_notifiers(ioapic->kvm, index, mask_after);
@@ -241,6 +255,9 @@ void kvm_ioapic_update_eoi(struct kvm *kvm, int vector, int trigger_mode)
{
struct kvm_ioapic *ioapic = kvm->arch.vioapic;
+ smp_rmb();
+ if (!test_bit(vector, ioapic->handled_vectors))
+ return;
mutex_lock(&ioapic->lock);
__kvm_ioapic_update_eoi(ioapic, vector, trigger_mode);
mutex_unlock(&ioapic->lock);
@@ -352,6 +369,7 @@ void kvm_ioapic_reset(struct kvm_ioapic *ioapic)
ioapic->ioregsel = 0;
ioapic->irr = 0;
ioapic->id = 0;
+ update_handled_vectors(ioapic);
}
static const struct kvm_io_device_ops ioapic_mmio_ops = {
@@ -401,6 +419,7 @@ int kvm_set_ioapic(struct kvm *kvm, struct kvm_ioapic_state *state)
mutex_lock(&ioapic->lock);
memcpy(ioapic, state, sizeof(struct kvm_ioapic_state));
+ update_handled_vectors(ioapic);
mutex_unlock(&ioapic->lock);
return 0;
}
diff --git a/virt/kvm/ioapic.h b/virt/kvm/ioapic.h
index 419c43b..a505ce9 100644
--- a/virt/kvm/ioapic.h
+++ b/virt/kvm/ioapic.h
@@ -46,6 +46,7 @@ struct kvm_ioapic {
struct kvm *kvm;
void (*ack_notifier)(void *opaque, int irq);
struct mutex lock;
+ DECLARE_BITMAP(handled_vectors, 256);
};
#ifdef DEBUG
--
1.6.5.3
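The fast path in the patch above can be tried outside the kernel. What follows is a minimal userspace sketch, not the KVM code: `toy_ioapic`, `toy_update_handled_vectors`, `toy_vector_handled` and `toy_update_eoi` are hypothetical names, `NUM_PINS` is assumed to mirror IOAPIC_NUM_PINS, a counter stands in for the mutex, and the smp_wmb()/smp_rmb() pairing is left out because the sketch is single-threaded.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

#define NUM_VECTORS   256
#define NUM_PINS      24  /* assumption: stands in for IOAPIC_NUM_PINS */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS  ((NUM_VECTORS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical stand-in for struct kvm_ioapic. */
struct toy_ioapic {
	unsigned char pin_vector[NUM_PINS];          /* vector programmed per pin */
	unsigned long handled_vectors[BITMAP_WORDS]; /* one bit per vector */
	int lock_acquisitions;                       /* counts slow-path entries */
};

/* Rebuild the bitmap in a temporary, then publish it with a single copy. */
static void toy_update_handled_vectors(struct toy_ioapic *io)
{
	unsigned long tmp[BITMAP_WORDS];
	int i;

	memset(tmp, 0, sizeof(tmp));
	for (i = 0; i < NUM_PINS; ++i) {
		unsigned int v = io->pin_vector[i];

		tmp[v / BITS_PER_WORD] |= 1UL << (v % BITS_PER_WORD);
	}
	memcpy(io->handled_vectors, tmp, sizeof(tmp));
}

/* Lock-free membership test, the analogue of test_bit(). */
static int toy_vector_handled(const struct toy_ioapic *io, int vector)
{
	assert(vector >= 0 && vector < NUM_VECTORS);
	return (io->handled_vectors[vector / BITS_PER_WORD]
		>> (vector % BITS_PER_WORD)) & 1UL;
}

/* Returns 1 if the EOI reached the (simulated) locked slow path, 0 otherwise. */
static int toy_update_eoi(struct toy_ioapic *io, int vector)
{
	if (!toy_vector_handled(io, vector))
		return 0;              /* not an ioapic vector: return early */
	io->lock_acquisitions++;       /* where mutex_lock() would be taken */
	return 1;
}
```

With the pins programmed to some range of vectors and toy_update_handled_vectors() called once, an EOI for a vector outside that range returns 0 and leaves lock_acquisitions unchanged.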
* Re: [PATCH] KVM: avoid taking ioapic mutex for non-ioapic EOIs
2009-12-28 12:08 [PATCH] KVM: avoid taking ioapic mutex for non-ioapic EOIs Avi Kivity
@ 2009-12-28 20:37 ` Marcelo Tosatti
2009-12-28 20:47 ` Avi Kivity
0 siblings, 1 reply; 6+ messages in thread
From: Marcelo Tosatti @ 2009-12-28 20:37 UTC (permalink / raw)
To: Avi Kivity; +Cc: Gleb Natapov, kvm
On Mon, Dec 28, 2009 at 02:08:30PM +0200, Avi Kivity wrote:
> When the guest acknowledges an interrupt, it sends an EOI message to the local
> apic, which broadcasts it to the ioapic. To handle the EOI, we need to take
> the ioapic mutex.
>
> On large guests, this causes a lot of contention on this mutex. Since large
> guests usually don't route interrupts via the ioapic (they use msi instead),
> this is completely unnecessary.
>
> Avoid taking the mutex by introducing a handled_vectors bitmap. Before taking
> the mutex, check if the ioapic was actually responsible for the acked vector.
> If not, we can return early.
Can't you skip IOAPIC EOI for edge triggered interrupts (in the LAPIC
code), instead?
> Signed-off-by: Avi Kivity <avi@redhat.com>
> ---
> virt/kvm/ioapic.c | 19 +++++++++++++++++++
> virt/kvm/ioapic.h | 1 +
> 2 files changed, 20 insertions(+), 0 deletions(-)
>
> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
> index f01392f..a2edfd1 100644
> --- a/virt/kvm/ioapic.c
> +++ b/virt/kvm/ioapic.c
> @@ -100,6 +100,19 @@ static int ioapic_service(struct kvm_ioapic *ioapic, unsigned int idx)
> return injected;
> }
>
> +static void update_handled_vectors(struct kvm_ioapic *ioapic)
> +{
> + DECLARE_BITMAP(handled_vectors, 256);
> + int i;
> +
> + memset(handled_vectors, 0, sizeof(handled_vectors));
> + for (i = 0; i < IOAPIC_NUM_PINS; ++i)
> + __set_bit(ioapic->redirtbl[i].fields.vector, handled_vectors);
> + memcpy(ioapic->handled_vectors, handled_vectors,
> + sizeof(handled_vectors));
> + smp_wmb();
> +}
> +
> static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val)
> {
> unsigned index;
> @@ -134,6 +147,7 @@ static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val)
> e->bits |= (u32) val;
> e->fields.remote_irr = 0;
> }
> + update_handled_vectors(ioapic);
> mask_after = e->fields.mask;
> if (mask_before != mask_after)
> kvm_fire_mask_notifiers(ioapic->kvm, index, mask_after);
> @@ -241,6 +255,9 @@ void kvm_ioapic_update_eoi(struct kvm *kvm, int vector, int trigger_mode)
> {
> struct kvm_ioapic *ioapic = kvm->arch.vioapic;
>
> + smp_rmb();
> + if (!test_bit(vector, ioapic->handled_vectors))
> + return;
> mutex_lock(&ioapic->lock);
> __kvm_ioapic_update_eoi(ioapic, vector, trigger_mode);
> mutex_unlock(&ioapic->lock);
> @@ -352,6 +369,7 @@ void kvm_ioapic_reset(struct kvm_ioapic *ioapic)
> ioapic->ioregsel = 0;
> ioapic->irr = 0;
> ioapic->id = 0;
> + update_handled_vectors(ioapic);
> }
>
> static const struct kvm_io_device_ops ioapic_mmio_ops = {
> @@ -401,6 +419,7 @@ int kvm_set_ioapic(struct kvm *kvm, struct kvm_ioapic_state *state)
>
> mutex_lock(&ioapic->lock);
> memcpy(ioapic, state, sizeof(struct kvm_ioapic_state));
> + update_handled_vectors(ioapic);
> mutex_unlock(&ioapic->lock);
> return 0;
> }
> diff --git a/virt/kvm/ioapic.h b/virt/kvm/ioapic.h
> index 419c43b..a505ce9 100644
> --- a/virt/kvm/ioapic.h
> +++ b/virt/kvm/ioapic.h
> @@ -46,6 +46,7 @@ struct kvm_ioapic {
> struct kvm *kvm;
> void (*ack_notifier)(void *opaque, int irq);
> struct mutex lock;
> + DECLARE_BITMAP(handled_vectors, 256);
> };
>
> #ifdef DEBUG
> --
> 1.6.5.3
* Re: [PATCH] KVM: avoid taking ioapic mutex for non-ioapic EOIs
2009-12-28 20:37 ` Marcelo Tosatti
@ 2009-12-28 20:47 ` Avi Kivity
2009-12-28 21:30 ` Marcelo Tosatti
0 siblings, 1 reply; 6+ messages in thread
From: Avi Kivity @ 2009-12-28 20:47 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: Gleb Natapov, kvm
On 12/28/2009 10:37 PM, Marcelo Tosatti wrote:
> On Mon, Dec 28, 2009 at 02:08:30PM +0200, Avi Kivity wrote:
>
>> When the guest acknowledges an interrupt, it sends an EOI message to the local
>> apic, which broadcasts it to the ioapic. To handle the EOI, we need to take
>> the ioapic mutex.
>>
>> On large guests, this causes a lot of contention on this mutex. Since large
>> guests usually don't route interrupts via the ioapic (they use msi instead),
>> this is completely unnecessary.
>>
>> Avoid taking the mutex by introducing a handled_vectors bitmap. Before taking
>> the mutex, check if the ioapic was actually responsible for the acked vector.
>> If not, we can return early.
>>
> Can't you skip IOAPIC EOI for edge triggered interrupts (in the LAPIC
> code), instead?
>
That's a lot cleaner, yes. Indeed there's the TMR which holds this
info. Gleb suggested doing this in the local apic but we didn't think
of using the TMR.
There's a small race there - the TMR is set after the IRR, so the
interrupt can be injected and acked before the TMR is updated, but that
can be fixed by switching the order.
But what about kvm_notify_acked_irq() in __kvm_ioapic_update_eoi()?
--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
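The TMR/IRR ordering fix discussed in this message can be sketched in userspace C11. This is only an illustration under assumptions: `toy_lapic`, `toy_accept_irq`, `toy_irq_pending` and `toy_is_level` are hypothetical names, the registers are modelled as four 64-bit atomic words each, and it is not the actual KVM local apic code. The point shown is that the trigger mode is recorded before the vector becomes visible as pending.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the two 256-bit local apic registers. */
struct toy_lapic {
	_Atomic unsigned long long tmr[4]; /* trigger mode: 1 = level */
	_Atomic unsigned long long irr[4]; /* interrupt request: 1 = pending */
};

/* Record the trigger mode first, then make the vector pending. */
static void toy_accept_irq(struct toy_lapic *apic, int vector, bool level)
{
	unsigned long long bit;
	int w;

	assert(vector >= 0 && vector < 256);
	w = vector / 64;
	bit = 1ULL << (vector % 64);

	if (level)
		atomic_fetch_or_explicit(&apic->tmr[w], bit,
					 memory_order_relaxed);
	else
		atomic_fetch_and_explicit(&apic->tmr[w], ~bit,
					  memory_order_relaxed);

	/* Release: whoever observes the IRR bit also observes the TMR update. */
	atomic_fetch_or_explicit(&apic->irr[w], bit, memory_order_release);
}

/* Acquire load pairs with the release in toy_accept_irq(). */
static bool toy_irq_pending(struct toy_lapic *apic, int vector)
{
	unsigned long long word =
		atomic_load_explicit(&apic->irr[vector / 64],
				     memory_order_acquire);

	return (word >> (vector % 64)) & 1ULL;
}

/* Meant to be called after toy_irq_pending() has returned true. */
static bool toy_is_level(struct toy_lapic *apic, int vector)
{
	unsigned long long word =
		atomic_load_explicit(&apic->tmr[vector / 64],
				     memory_order_relaxed);

	return (word >> (vector % 64)) & 1ULL;
}
```

If the two stores in toy_accept_irq() were swapped, a reader could see the vector pending, deliver it and take the EOI while the TMR bit still held the previous trigger mode, which is the race described above.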
* Re: [PATCH] KVM: avoid taking ioapic mutex for non-ioapic EOIs
2009-12-28 20:47 ` Avi Kivity
@ 2009-12-28 21:30 ` Marcelo Tosatti
2009-12-29 10:35 ` Avi Kivity
0 siblings, 1 reply; 6+ messages in thread
From: Marcelo Tosatti @ 2009-12-28 21:30 UTC (permalink / raw)
To: Avi Kivity; +Cc: Gleb Natapov, kvm
On Mon, Dec 28, 2009 at 10:47:20PM +0200, Avi Kivity wrote:
> On 12/28/2009 10:37 PM, Marcelo Tosatti wrote:
>> On Mon, Dec 28, 2009 at 02:08:30PM +0200, Avi Kivity wrote:
>>
>>> When the guest acknowledges an interrupt, it sends an EOI message to the local
>>> apic, which broadcasts it to the ioapic. To handle the EOI, we need to take
>>> the ioapic mutex.
>>>
>>> On large guests, this causes a lot of contention on this mutex. Since large
>>> guests usually don't route interrupts via the ioapic (they use msi instead),
>>> this is completely unnecessary.
>>>
>>> Avoid taking the mutex by introducing a handled_vectors bitmap. Before taking
>>> the mutex, check if the ioapic was actually responsible for the acked vector.
>>> If not, we can return early.
>>>
>> Can't you skip IOAPIC EOI for edge triggered interrupts (in the LAPIC
>> code), instead?
>>
>
> That's a lot cleaner, yes. Indeed there's the TMR which holds this
> info. Gleb suggested doing this in the local apic but we didn't think
> of using the TMR.
Problem with storing in the LAPIC is you have to migrate the bitmap
along (otherwise can't know if EOI is from MSI or IOAPIC). But it sounds
much simpler.
> There's a small race there - the TMR is set after the IRR, so the
> interrupt can be injected and acked before the TMR is updated, but that
> can be fixed by switching the order.
Makes sense.
> But what about kvm_notify_acked_irq() in __kvm_ioapic_update_eoi()?
Oops.
The worrying thing about the handled_vectors bitmap in the IOAPIC is
that the update is not atomic with respect to the lapic EOI handler.
Unless it's certain that races there are the guest's problem, which should
have proper locking to never allow things like
kvm_set_ioapic vec
update handled bitmap, vec not IOAPIC
handled anymore
ack lapic irq vec
to happen.
(with bitmap in LAPIC you avoid those things).
* Re: [PATCH] KVM: avoid taking ioapic mutex for non-ioapic EOIs
2009-12-28 21:30 ` Marcelo Tosatti
@ 2009-12-29 10:35 ` Avi Kivity
2009-12-29 16:59 ` Marcelo Tosatti
0 siblings, 1 reply; 6+ messages in thread
From: Avi Kivity @ 2009-12-29 10:35 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: Gleb Natapov, kvm
On 12/28/2009 11:30 PM, Marcelo Tosatti wrote:
> On Mon, Dec 28, 2009 at 10:47:20PM +0200, Avi Kivity wrote:
>
>> On 12/28/2009 10:37 PM, Marcelo Tosatti wrote:
>>
>>> On Mon, Dec 28, 2009 at 02:08:30PM +0200, Avi Kivity wrote:
>>>
>>>
>>>> When the guest acknowledges an interrupt, it sends an EOI message to the local
>>>> apic, which broadcasts it to the ioapic. To handle the EOI, we need to take
>>>> the ioapic mutex.
>>>>
>>>> On large guests, this causes a lot of contention on this mutex. Since large
>>>> guests usually don't route interrupts via the ioapic (they use msi instead),
>>>> this is completely unnecessary.
>>>>
>>>> Avoid taking the mutex by introducing a handled_vectors bitmap. Before taking
>>>> the mutex, check if the ioapic was actually responsible for the acked vector.
>>>> If not, we can return early.
>>>>
>>>>
>>> Can't you skip IOAPIC EOI for edge triggered interrupts (in the LAPIC
>>> code), instead?
>>>
>>>
>> That's a lot cleaner, yes. Indeed there's the TMR which holds this
>> info. Gleb suggested doing this in the local apic but we didn't think
>> of using the TMR.
>>
> Problem with storing in the LAPIC is you have to migrate the bitmap
> along (otherwise can't know if EOI is from MSI or IOAPIC). But it sounds
> much simpler.
>
If we move the handled_vectors bitmap to the local apic, I don't see how
it simplifies things.
>> There's a small race there - the TMR is set after the IRR, so the
>> interrupt can be injected and acked before the TMR is updated, but that
>> can be fixed by switching the order.
>>
> Makes sense.
>
Btw, that race is already exposed to the guest, if it cares to read
TMR. I'll send a patch.
>
>> But what about kvm_notify_acked_irq() in __kvm_ioapic_update_eoi()?
>>
> Oops.
>
> The worrying thing about the handled_vectors bitmap in the IOAPIC is
> that the update is not atomic with respect to the lapic EOI handler.
>
> Unless it's certain that races there are the guest's problem, which should
> have proper locking to never allow things like
>
> kvm_set_ioapic vec
> update handled bitmap, vec not IOAPIC
> handled anymore
> ack lapic irq vec
>
> to happen.
>
> (with bitmap in LAPIC you avoid those things).
>
>
It seems real hardware will have the same issue (also look at comments
regarding irq migration in arch/x86/kernel/io_apic.c). So I think a
guest is required to ack before migrating an irq.
--
error compiling committee.c: too many arguments to function
* Re: [PATCH] KVM: avoid taking ioapic mutex for non-ioapic EOIs
2009-12-29 10:35 ` Avi Kivity
@ 2009-12-29 16:59 ` Marcelo Tosatti
0 siblings, 0 replies; 6+ messages in thread
From: Marcelo Tosatti @ 2009-12-29 16:59 UTC (permalink / raw)
To: Avi Kivity; +Cc: Gleb Natapov, kvm
On Tue, Dec 29, 2009 at 12:35:17PM +0200, Avi Kivity wrote:
> On 12/28/2009 11:30 PM, Marcelo Tosatti wrote:
>> On Mon, Dec 28, 2009 at 10:47:20PM +0200, Avi Kivity wrote:
>>
>>> On 12/28/2009 10:37 PM, Marcelo Tosatti wrote:
>>>
>>>> On Mon, Dec 28, 2009 at 02:08:30PM +0200, Avi Kivity wrote:
>>>>
>>>>
>>>>> When the guest acknowledges an interrupt, it sends an EOI message to the local
>>>>> apic, which broadcasts it to the ioapic. To handle the EOI, we need to take
>>>>> the ioapic mutex.
>>>>>
>>>>> On large guests, this causes a lot of contention on this mutex. Since large
>>>>> guests usually don't route interrupts via the ioapic (they use msi instead),
>>>>> this is completely unnecessary.
>>>>>
>>>>> Avoid taking the mutex by introducing a handled_vectors bitmap. Before taking
>>>>> the mutex, check if the ioapic was actually responsible for the acked vector.
>>>>> If not, we can return early.
>>>>>
>>>>>
>>>> Can't you skip IOAPIC EOI for edge triggered interrupts (in the LAPIC
>>>> code), instead?
>>>>
>>>>
>>> That's a lot cleaner, yes. Indeed there's the TMR which holds this
>>> info. Gleb suggested doing this in the local apic but we didn't think
>>> of using the TMR.
>>>
>> Problem with storing in the LAPIC is you have to migrate the bitmap
>> along (otherwise can't know if EOI is from MSI or IOAPIC). But it sounds
>> much simpler.
>>
>
> If we move the handled_vectors bitmap to the local apic, I don't see how
> it simplifies things.
It's vcpu-local.
>>> There's a small race there - the TMR is set after the IRR, so the
>>> interrupt can be injected and acked before the TMR is updated, but that
>>> can be fixed by switching the order.
>>>
>> Makes sense.
>>
>
> Btw, that race is already exposed to the guest, if it cares to read TMR.
> I'll send a patch.
>
>>
>>> But what about kvm_notify_acked_irq() in __kvm_ioapic_update_eoi()?
>>>
>> Oops.
>>
>> The worrying thing about the handled_vectors bitmap in the IOAPIC is
>> that the update is not atomic with respect to the lapic EOI handler.
>>
>> Unless it's certain that races there are the guest's problem, which should
>> have proper locking to never allow things like
>>
>> kvm_set_ioapic vec
>> update handled bitmap, vec not IOAPIC
>> handled anymore
>> ack lapic irq vec
>>
>> to happen.
>>
>> (with bitmap in LAPIC you avoid those things).
>>
>>
>
> It seems real hardware will have the same issue (also look at comments
> regarding irq migration in arch/x86/kernel/io_apic.c). So I think a
> guest is required to ack before migrating an irq.
Fair. Applied, thanks.