* Re: [Qemu-devel] [PATCH 1/1 V4] qemu-kvm: fix improper nmi emulation
2011-10-14 6:49 ` Jan Kiszka
@ 2011-10-14 7:43 ` Lai Jiangshan
2011-10-14 8:31 ` Jan Kiszka
2011-10-14 9:03 ` [Qemu-devel] [PATCH 1/1 V5] kernel/kvm: introduce KVM_SET_LINT1 and " Lai Jiangshan
` (4 subsequent siblings)
5 siblings, 1 reply; 69+ messages in thread
From: Lai Jiangshan @ 2011-10-14 7:43 UTC (permalink / raw)
To: Jan Kiszka
Cc: kvm@vger.kernel.org, Avi Kivity, qemu-devel@nongnu.org,
KAMEZAWA Hiroyuki, Kenji Kaneshige
On 10/14/2011 02:49 PM, Jan Kiszka wrote:
> On 2011-10-14 08:36, Lai Jiangshan wrote:
>> On 10/14/2011 01:53 PM, Jan Kiszka wrote:
>>> On 2011-10-14 02:53, Lai Jiangshan wrote:
>>>>
>>>>>
>>>>> As explained in some other mail, we could then emulate the missing
>>>>> kernel feature by reading out the current in-kernel APIC state, testing
>>>>> if LINT1 is unmasked, and then delivering the NMI directly.
>>>>>
>>>>
>>>> Only the thread of the VCPU can safely get the in-kernel LAPIC states,
>>>> so this approach will cause some troubles.
>>>
>>> run_on_cpu() can help.
>>>
>>> Jan
>>>
>>
>> Ah, I forgot it, Thanks.
>>
>> From: Lai Jiangshan <laijs@cn.fujitsu.com>
>>
>> Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
>> button event happens. This doesn't properly emulate real hardware on
>> which NMI button event triggers LINT1. Because of this, NMI is sent to
>> the processor even when LINT1 is masked in LVT. For example, this
>> causes the problem that kdump initiated by NMI sometimes doesn't work
>> on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
>>
>> With this patch, inject-nmi request is handled as follows.
>>
>> - When in-kernel irqchip is disabled, deliver LINT1 instead of NMI
>> interrupt.
>> - When in-kernel irqchip is enabled, get the in-kernel LAPIC states
>> and test APIC_LVT_MASKED; if LINT1 is unmasked, deliver the NMI
>> directly. (Suggested by Jan Kiszka)
>>
>> Changed from old version:
>> re-implemented following Jan's suggestion.
>>
>> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
>> Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
>> ---
>> hw/apic.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
>> hw/apic.h | 1 +
>> monitor.c | 6 +++++-
>> 3 files changed, 54 insertions(+), 1 deletions(-)
>> diff --git a/hw/apic.c b/hw/apic.c
>> index 69d6ac5..9a40129 100644
>> --- a/hw/apic.c
>> +++ b/hw/apic.c
>> @@ -205,6 +205,54 @@ void apic_deliver_pic_intr(DeviceState *d, int level)
>> }
>> }
>>
>> +#ifdef KVM_CAP_IRQCHIP
>
> Again, this is always defined on x86 thus pointless to test.
>
>> +static inline uint32_t kapic_reg(struct kvm_lapic_state *kapic, int reg_id);
>> +
>> +struct kvm_get_remote_lapic_params {
>> + CPUState *env;
>> + struct kvm_lapic_state klapic;
>> +};
>> +
>> +static void kvm_get_remote_lapic(void *p)
>> +{
>> + struct kvm_get_remote_lapic_params *params = p;
>> +
>> + kvm_get_lapic(params->env, &params->klapic);
>
> When you already interrupted that vcpu, why not inject from here? Avoids
> one further ping-pong round.
get_remote_lapic and inject nmi are two different things,
so I don't inject nmi from here. I didn't notice this ping-pong overhead.
Thank you.
>
>> +}
>> +
>> +void apic_deliver_nmi(DeviceState *d)
>> +{
>> + APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
>> +
>> + if (kvm_irqchip_in_kernel()) {
>> + struct kvm_get_remote_lapic_params p = {.env = s->cpu_env,};
>> + uint32_t lvt;
>> +
>> + run_on_cpu(s->cpu_env, kvm_get_remote_lapic, &p);
>> + lvt = kapic_reg(&p.klapic, 0x32 + APIC_LVT_LINT1);
>> +
>> + if (lvt & APIC_LVT_MASKED) {
>> + return;
>> + }
>> +
>> + if (((lvt >> 8) & 7) != APIC_DM_NMI) {
>> + return;
>> + }
>> +
>> + cpu_interrupt(s->cpu_env, CPU_INTERRUPT_NMI);
>
> Err, aren't you introducing KVM_CAP_LAPIC_NMI that allows to test if
> this workaround is needed? Oh, your latest kernel patch is missing this
> again - requires fixing as well.
>
The kernel-side patch is dropped with this v4 patch.
Did you mean you want KVM_CAP_SET_LINT1 + KVM_SET_LINT1 patches?
I have made them and will send them soon.
Thanks,
Lai
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [Qemu-devel] [PATCH 1/1 V4] qemu-kvm: fix improper nmi emulation
2011-10-14 7:43 ` Lai Jiangshan
@ 2011-10-14 8:31 ` Jan Kiszka
0 siblings, 0 replies; 69+ messages in thread
From: Jan Kiszka @ 2011-10-14 8:31 UTC (permalink / raw)
To: Lai Jiangshan
Cc: kvm@vger.kernel.org, Avi Kivity, qemu-devel@nongnu.org,
KAMEZAWA Hiroyuki, Kenji Kaneshige
On 2011-10-14 09:43, Lai Jiangshan wrote:
> On 10/14/2011 02:49 PM, Jan Kiszka wrote:
>> On 2011-10-14 08:36, Lai Jiangshan wrote:
>>> On 10/14/2011 01:53 PM, Jan Kiszka wrote:
>>>> On 2011-10-14 02:53, Lai Jiangshan wrote:
>>>>>
>>>>>>
>>>>>> As explained in some other mail, we could then emulate the missing
>>>>>> kernel feature by reading out the current in-kernel APIC state, testing
>>>>>> if LINT1 is unmasked, and then delivering the NMI directly.
>>>>>>
>>>>>
>>>>> Only the thread of the VCPU can safely get the in-kernel LAPIC states,
>>>>> so this approach will cause some troubles.
>>>>
>>>> run_on_cpu() can help.
>>>>
>>>> Jan
>>>>
>>>
>>> Ah, I forgot it, Thanks.
>>>
>>> From: Lai Jiangshan <laijs@cn.fujitsu.com>
>>>
>>> Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
>>> button event happens. This doesn't properly emulate real hardware on
>>> which NMI button event triggers LINT1. Because of this, NMI is sent to
>>> the processor even when LINT1 is masked in LVT. For example, this
>>> causes the problem that kdump initiated by NMI sometimes doesn't work
>>> on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
>>>
>>> With this patch, inject-nmi request is handled as follows.
>>>
>>> - When in-kernel irqchip is disabled, deliver LINT1 instead of NMI
>>> interrupt.
>>> - When in-kernel irqchip is enabled, get the in-kernel LAPIC states
>>> and test APIC_LVT_MASKED; if LINT1 is unmasked, deliver the NMI
>>> directly. (Suggested by Jan Kiszka)
>>>
>>> Changed from old version:
>>> re-implemented following Jan's suggestion.
>>>
>>> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
>>> Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
>>> ---
>>> hw/apic.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
>>> hw/apic.h | 1 +
>>> monitor.c | 6 +++++-
>>> 3 files changed, 54 insertions(+), 1 deletions(-)
>>> diff --git a/hw/apic.c b/hw/apic.c
>>> index 69d6ac5..9a40129 100644
>>> --- a/hw/apic.c
>>> +++ b/hw/apic.c
>>> @@ -205,6 +205,54 @@ void apic_deliver_pic_intr(DeviceState *d, int level)
>>> }
>>> }
>>>
>>> +#ifdef KVM_CAP_IRQCHIP
>>
>> Again, this is always defined on x86 thus pointless to test.
>>
>>> +static inline uint32_t kapic_reg(struct kvm_lapic_state *kapic, int reg_id);
>>> +
>>> +struct kvm_get_remote_lapic_params {
>>> + CPUState *env;
>>> + struct kvm_lapic_state klapic;
>>> +};
>>> +
>>> +static void kvm_get_remote_lapic(void *p)
>>> +{
>>> + struct kvm_get_remote_lapic_params *params = p;
>>> +
>>> + kvm_get_lapic(params->env, &params->klapic);
>>
>> When you already interrupted that vcpu, why not inject from here? Avoids
>> one further ping-pong round.
>
> get_remote_lapic and inject nmi are two different things,
> so I don't inject nmi from here. I didn't notice this ping-pong overhead.
> Thank you.
Actually, it is not performance-critical. But there is a race between
obtaining the APIC state and testing for the NMI injection path. So it's
better to define an on-vcpu LINT1 NMI injection service.
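To make the point concrete, here is a minimal toy model of such an on-vcpu service. It is only a sketch under stated assumptions: `struct toy_vcpu`, `toy_run_on_cpu()` and the `TOY_*` constants are hypothetical stand-ins (not QEMU or KVM code), and the stand-in for run_on_cpu() simply calls the function synchronously. What it illustrates is that the LVT test and the injection happen in one step on the target vcpu, so there is no window between reading the state and acting on it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* LVT layout as in the local APIC: mask is bit 16, delivery mode is bits 8-10. */
#define TOY_LVT_MASKED (1u << 16)
#define TOY_DM_SHIFT   8
#define TOY_DM_MASK    0x7u
#define TOY_DM_NMI     0x4u

/* Hypothetical per-vcpu state; only what the sketch needs. */
struct toy_vcpu {
    uint32_t lvt_lint1;  /* LVT LINT1 register contents */
    bool nmi_pending;    /* set when an NMI has been queued */
};

/* Stand-in for run_on_cpu(): the toy version just calls func directly. */
static void toy_run_on_cpu(struct toy_vcpu *vcpu, void (*func)(void *), void *data)
{
    assert(vcpu != NULL);
    func(data);
}

/* Runs "on the vcpu": test the LVT and inject in a single step. */
static void toy_lint1_nmi_on_vcpu(void *data)
{
    struct toy_vcpu *vcpu = data;
    uint32_t lvt = vcpu->lvt_lint1;

    if (lvt & TOY_LVT_MASKED) {
        return;  /* LINT1 masked: the button press is ignored */
    }
    if (((lvt >> TOY_DM_SHIFT) & TOY_DM_MASK) != TOY_DM_NMI) {
        return;  /* LINT1 not programmed for NMI delivery */
    }
    vcpu->nmi_pending = true;
}

/* Returns true if the button press resulted in a pending NMI. */
static bool toy_press_nmi_button(struct toy_vcpu *vcpu)
{
    vcpu->nmi_pending = false;
    toy_run_on_cpu(vcpu, toy_lint1_nmi_on_vcpu, vcpu);
    return vcpu->nmi_pending;
}
```

With this shape, a vcpu whose LINT1 is unmasked and set to NMI delivery gets the NMI, while masked vcpus (the kdump case) do not.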
>
>>
>>> +}
>>> +
>>> +void apic_deliver_nmi(DeviceState *d)
>>> +{
>>> + APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
>>> +
>>> + if (kvm_irqchip_in_kernel()) {
>>> + struct kvm_get_remote_lapic_params p = {.env = s->cpu_env,};
>>> + uint32_t lvt;
>>> +
>>> + run_on_cpu(s->cpu_env, kvm_get_remote_lapic, &p);
>>> + lvt = kapic_reg(&p.klapic, 0x32 + APIC_LVT_LINT1);
>>> +
>>> + if (lvt & APIC_LVT_MASKED) {
>>> + return;
>>> + }
>>> +
>>> + if (((lvt >> 8) & 7) != APIC_DM_NMI) {
>>> + return;
>>> + }
>>> +
>>> + cpu_interrupt(s->cpu_env, CPU_INTERRUPT_NMI);
>>
>> Err, aren't you introducing KVM_CAP_LAPIC_NMI that allows to test if
>> this workaround is needed? Oh, your latest kernel patch is missing this
>> again - requires fixing as well.
>>
>
>
> The kernel-side patch is dropped with this v4 patch.
>
> Did you mean you want KVM_CAP_SET_LINT1 + KVM_SET_LINT1 patches?
> I have made them.
OK, so this is going to be applied on top? Then I take this remark back.
Jan
^ permalink raw reply [flat|nested] 69+ messages in thread
* [Qemu-devel] [PATCH 1/1 V5] kernel/kvm: introduce KVM_SET_LINT1 and fix improper nmi emulation
2011-10-14 6:49 ` Jan Kiszka
2011-10-14 7:43 ` Lai Jiangshan
@ 2011-10-14 9:03 ` Lai Jiangshan
2011-10-14 9:07 ` Jan Kiszka
2011-10-16 9:39 ` Avi Kivity
2011-10-14 9:03 ` [Qemu-devel] [PATCH 1/2 V5] qemu-kvm: Synchronize kernel headers Lai Jiangshan
` (3 subsequent siblings)
5 siblings, 2 replies; 69+ messages in thread
From: Lai Jiangshan @ 2011-10-14 9:03 UTC (permalink / raw)
To: Jan Kiszka
Cc: kvm@vger.kernel.org, Avi Kivity, qemu-devel@nongnu.org,
KAMEZAWA Hiroyuki, Kenji Kaneshige
Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
button event happens. This doesn't properly emulate real hardware on
which NMI button event triggers LINT1. Because of this, NMI is sent to
the processor even when LINT1 is masked in LVT. For example, this
causes the problem that kdump initiated by NMI sometimes doesn't work
on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
With this patch, we introduce KVM_SET_LINT1,
and we can use KVM_SET_LINT1 to correctly emulate the NMI button
without changing the old KVM_NMI behavior.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
---
arch/x86/include/asm/kvm.h | 1 +
arch/x86/kvm/irq.h | 1 +
arch/x86/kvm/lapic.c | 7 +++++++
arch/x86/kvm/x86.c | 8 ++++++++
include/linux/kvm.h | 5 +++++
5 files changed, 22 insertions(+), 0 deletions(-)
diff --git a/arch/x86/include/asm/kvm.h b/arch/x86/include/asm/kvm.h
index 4d8dcbd..88d0ac3 100644
--- a/arch/x86/include/asm/kvm.h
+++ b/arch/x86/include/asm/kvm.h
@@ -24,6 +24,7 @@
#define __KVM_HAVE_DEBUGREGS
#define __KVM_HAVE_XSAVE
#define __KVM_HAVE_XCRS
+#define __KVM_HAVE_SET_LINT1
/* Architectural interrupt line count. */
#define KVM_NR_INTERRUPTS 256
diff --git a/arch/x86/kvm/irq.h b/arch/x86/kvm/irq.h
index 53e2d08..0c96315 100644
--- a/arch/x86/kvm/irq.h
+++ b/arch/x86/kvm/irq.h
@@ -95,6 +95,7 @@ void kvm_pic_reset(struct kvm_kpic_state *s);
void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu);
void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu);
void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu);
+void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu);
void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu);
void __kvm_migrate_pit_timer(struct kvm_vcpu *vcpu);
void __kvm_migrate_timers(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 57dcbd4..87fe36a 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1039,6 +1039,13 @@ void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu)
kvm_apic_local_deliver(apic, APIC_LVT0);
}
+void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu)
+{
+ struct kvm_lapic *apic = vcpu->arch.apic;
+
+ kvm_apic_local_deliver(apic, APIC_LVT1);
+}
+
static struct kvm_timer_ops lapic_timer_ops = {
.is_periodic = lapic_is_periodic,
};
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 84a28ea..fccd094 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2077,6 +2077,7 @@ int kvm_dev_ioctl_check_extension(long ext)
case KVM_CAP_XSAVE:
case KVM_CAP_ASYNC_PF:
case KVM_CAP_GET_TSC_KHZ:
+ case KVM_CAP_SET_LINT1:
r = 1;
break;
case KVM_CAP_COALESCED_MMIO:
@@ -3264,6 +3265,13 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
goto out;
}
+ case KVM_SET_LINT1: {
+ r = -EINVAL;
+ if (!irqchip_in_kernel(vcpu->kvm))
+ goto out;
+ r = 0;
+ kvm_apic_lint1_deliver(vcpu);
+ }
default:
r = -EINVAL;
}
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index aace6b8..3a10572 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -554,6 +554,9 @@ struct kvm_ppc_pvinfo {
#define KVM_CAP_PPC_SMT 64
#define KVM_CAP_PPC_RMA 65
#define KVM_CAP_S390_GMAP 71
+#ifdef __KVM_HAVE_SET_LINT1
+#define KVM_CAP_SET_LINT1 72
+#endif
#ifdef KVM_CAP_IRQ_ROUTING
@@ -759,6 +762,8 @@ struct kvm_clock_data {
#define KVM_CREATE_SPAPR_TCE _IOW(KVMIO, 0xa8, struct kvm_create_spapr_tce)
/* Available with KVM_CAP_RMA */
#define KVM_ALLOCATE_RMA _IOR(KVMIO, 0xa9, struct kvm_allocate_rma)
+/* Available with KVM_CAP_SET_LINT1 for x86 */
+#define KVM_SET_LINT1 _IO(KVMIO, 0xaa)
#define KVM_DEV_ASSIGN_ENABLE_IOMMU (1 << 0)
^ permalink raw reply related [flat|nested] 69+ messages in thread
* Re: [Qemu-devel] [PATCH 1/1 V5] kernel/kvm: introduce KVM_SET_LINT1 and fix improper nmi emulation
2011-10-14 9:03 ` [Qemu-devel] [PATCH 1/1 V5] kernel/kvm: introduce KVM_SET_LINT1 and " Lai Jiangshan
@ 2011-10-14 9:07 ` Jan Kiszka
2011-10-14 9:27 ` Lai Jiangshan
2011-10-16 9:39 ` Avi Kivity
1 sibling, 1 reply; 69+ messages in thread
From: Jan Kiszka @ 2011-10-14 9:07 UTC (permalink / raw)
To: Lai Jiangshan
Cc: kvm@vger.kernel.org, Avi Kivity, qemu-devel@nongnu.org,
KAMEZAWA Hiroyuki, Kenji Kaneshige
On 2011-10-14 11:03, Lai Jiangshan wrote:
> Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
> button event happens. This doesn't properly emulate real hardware on
> which NMI button event triggers LINT1. Because of this, NMI is sent to
> the processor even when LINT1 is masked in LVT. For example, this
> causes the problem that kdump initiated by NMI sometimes doesn't work
> on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
>
> With this patch, we introduce KVM_SET_LINT1,
> and we can use KVM_SET_LINT1 to correctly emulate the NMI button
> without changing the old KVM_NMI behavior.
>
> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
> Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
> ---
> arch/x86/include/asm/kvm.h | 1 +
> arch/x86/kvm/irq.h | 1 +
> arch/x86/kvm/lapic.c | 7 +++++++
> arch/x86/kvm/x86.c | 8 ++++++++
> include/linux/kvm.h | 5 +++++
> 5 files changed, 22 insertions(+), 0 deletions(-)
> diff --git a/arch/x86/include/asm/kvm.h b/arch/x86/include/asm/kvm.h
> index 4d8dcbd..88d0ac3 100644
> --- a/arch/x86/include/asm/kvm.h
> +++ b/arch/x86/include/asm/kvm.h
> @@ -24,6 +24,7 @@
> #define __KVM_HAVE_DEBUGREGS
> #define __KVM_HAVE_XSAVE
> #define __KVM_HAVE_XCRS
> +#define __KVM_HAVE_SET_LINT1
>
> /* Architectural interrupt line count. */
> #define KVM_NR_INTERRUPTS 256
> diff --git a/arch/x86/kvm/irq.h b/arch/x86/kvm/irq.h
> index 53e2d08..0c96315 100644
> --- a/arch/x86/kvm/irq.h
> +++ b/arch/x86/kvm/irq.h
> @@ -95,6 +95,7 @@ void kvm_pic_reset(struct kvm_kpic_state *s);
> void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu);
> void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu);
> void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu);
> +void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu);
> void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu);
> void __kvm_migrate_pit_timer(struct kvm_vcpu *vcpu);
> void __kvm_migrate_timers(struct kvm_vcpu *vcpu);
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 57dcbd4..87fe36a 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1039,6 +1039,13 @@ void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu)
> kvm_apic_local_deliver(apic, APIC_LVT0);
> }
>
> +void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu)
> +{
> + struct kvm_lapic *apic = vcpu->arch.apic;
> +
> + kvm_apic_local_deliver(apic, APIC_LVT1);
> +}
> +
> static struct kvm_timer_ops lapic_timer_ops = {
> .is_periodic = lapic_is_periodic,
> };
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 84a28ea..fccd094 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2077,6 +2077,7 @@ int kvm_dev_ioctl_check_extension(long ext)
> case KVM_CAP_XSAVE:
> case KVM_CAP_ASYNC_PF:
> case KVM_CAP_GET_TSC_KHZ:
> + case KVM_CAP_SET_LINT1:
> r = 1;
> break;
> case KVM_CAP_COALESCED_MMIO:
> @@ -3264,6 +3265,13 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>
> goto out;
> }
> + case KVM_SET_LINT1: {
> + r = -EINVAL;
> + if (!irqchip_in_kernel(vcpu->kvm))
> + goto out;
> + r = 0;
> + kvm_apic_lint1_deliver(vcpu);
> + }
> default:
> r = -EINVAL;
> }
> diff --git a/include/linux/kvm.h b/include/linux/kvm.h
> index aace6b8..3a10572 100644
> --- a/include/linux/kvm.h
> +++ b/include/linux/kvm.h
> @@ -554,6 +554,9 @@ struct kvm_ppc_pvinfo {
> #define KVM_CAP_PPC_SMT 64
> #define KVM_CAP_PPC_RMA 65
> #define KVM_CAP_S390_GMAP 71
> +#ifdef __KVM_HAVE_SET_LINT1
> +#define KVM_CAP_SET_LINT1 72
> +#endif
Actually, there is no need for __KVM_HAVE_SET_LINT1 and #ifdef. User
land will just do a runtime check.
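A runtime check of this kind could look roughly like the sketch below. It uses the existing KVM_CHECK_EXTENSION ioctl; everything else is an assumption made for illustration: the capability number is the value proposed in this patch (it may be assigned to something else in a released kernel, so do not rely on it against a real /dev/kvm), and `kvm_has_capability()`, `nmi_button_path()` and the enum are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Value proposed in this patch only; an assumption, not a mainline constant. */
#define PROPOSED_KVM_CAP_SET_LINT1 72

/* Hypothetical names for the two ways user land could handle the NMI button. */
enum nmi_button_path {
    NMI_PATH_LEGACY,     /* fall back to the existing behaviour */
    NMI_PATH_SET_LINT1   /* use the proposed KVM_SET_LINT1 ioctl */
};

/*
 * Ask the kernel at run time whether a capability is present.
 * KVM_CHECK_EXTENSION returns 0 for "unsupported", a positive value for
 * "supported", and -1 on error (for example a bad file descriptor).
 */
static bool kvm_has_capability(int kvm_fd, long cap)
{
    assert(cap >= 0);
    return ioctl(kvm_fd, KVM_CHECK_EXTENSION, cap) > 0;
}

/* Pick the code path once, based on what the running kernel reports. */
static enum nmi_button_path nmi_button_path(int kvm_fd)
{
    return kvm_has_capability(kvm_fd, PROPOSED_KVM_CAP_SET_LINT1)
               ? NMI_PATH_SET_LINT1
               : NMI_PATH_LEGACY;
}
```

Because the decision is taken at run time, the same binary works on kernels with and without the feature, which is why the compile-time guard adds little.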
Jan
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [Qemu-devel] [PATCH 1/1 V5] kernel/kvm: introduce KVM_SET_LINT1 and fix improper nmi emulation
2011-10-14 9:07 ` Jan Kiszka
@ 2011-10-14 9:27 ` Lai Jiangshan
2011-10-14 9:32 ` Jan Kiszka
0 siblings, 1 reply; 69+ messages in thread
From: Lai Jiangshan @ 2011-10-14 9:27 UTC (permalink / raw)
To: Jan Kiszka
Cc: Kenji Kaneshige, KAMEZAWA Hiroyuki, Avi Kivity,
kvm@vger.kernel.org, qemu-devel@nongnu.org
On 10/14/2011 05:07 PM, Jan Kiszka wrote:
> On 2011-10-14 11:03, Lai Jiangshan wrote:
>> Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
>> button event happens. This doesn't properly emulate real hardware on
>> which NMI button event triggers LINT1. Because of this, NMI is sent to
>> the processor even when LINT1 is masked in LVT. For example, this
>> causes the problem that kdump initiated by NMI sometimes doesn't work
>> on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
>>
>> With this patch, we introduce KVM_SET_LINT1,
>> and we can use KVM_SET_LINT1 to correctly emulate the NMI button
>> without changing the old KVM_NMI behavior.
>>
>> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
>> Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
>> ---
>> arch/x86/include/asm/kvm.h | 1 +
>> arch/x86/kvm/irq.h | 1 +
>> arch/x86/kvm/lapic.c | 7 +++++++
>> arch/x86/kvm/x86.c | 8 ++++++++
>> include/linux/kvm.h | 5 +++++
>> 5 files changed, 22 insertions(+), 0 deletions(-)
>> diff --git a/arch/x86/include/asm/kvm.h b/arch/x86/include/asm/kvm.h
>> index 4d8dcbd..88d0ac3 100644
>> --- a/arch/x86/include/asm/kvm.h
>> +++ b/arch/x86/include/asm/kvm.h
>> @@ -24,6 +24,7 @@
>> #define __KVM_HAVE_DEBUGREGS
>> #define __KVM_HAVE_XSAVE
>> #define __KVM_HAVE_XCRS
>> +#define __KVM_HAVE_SET_LINT1
>>
>> /* Architectural interrupt line count. */
>> #define KVM_NR_INTERRUPTS 256
>> diff --git a/arch/x86/kvm/irq.h b/arch/x86/kvm/irq.h
>> index 53e2d08..0c96315 100644
>> --- a/arch/x86/kvm/irq.h
>> +++ b/arch/x86/kvm/irq.h
>> @@ -95,6 +95,7 @@ void kvm_pic_reset(struct kvm_kpic_state *s);
>> void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu);
>> void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu);
>> void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu);
>> +void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu);
>> void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu);
>> void __kvm_migrate_pit_timer(struct kvm_vcpu *vcpu);
>> void __kvm_migrate_timers(struct kvm_vcpu *vcpu);
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index 57dcbd4..87fe36a 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -1039,6 +1039,13 @@ void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu)
>> kvm_apic_local_deliver(apic, APIC_LVT0);
>> }
>>
>> +void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu)
>> +{
>> + struct kvm_lapic *apic = vcpu->arch.apic;
>> +
>> + kvm_apic_local_deliver(apic, APIC_LVT1);
>> +}
>> +
>> static struct kvm_timer_ops lapic_timer_ops = {
>> .is_periodic = lapic_is_periodic,
>> };
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 84a28ea..fccd094 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -2077,6 +2077,7 @@ int kvm_dev_ioctl_check_extension(long ext)
>> case KVM_CAP_XSAVE:
>> case KVM_CAP_ASYNC_PF:
>> case KVM_CAP_GET_TSC_KHZ:
>> + case KVM_CAP_SET_LINT1:
>> r = 1;
>> break;
>> case KVM_CAP_COALESCED_MMIO:
>> @@ -3264,6 +3265,13 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>>
>> goto out;
>> }
>> + case KVM_SET_LINT1: {
>> + r = -EINVAL;
>> + if (!irqchip_in_kernel(vcpu->kvm))
>> + goto out;
>> + r = 0;
>> + kvm_apic_lint1_deliver(vcpu);
>> + }
>> default:
>> r = -EINVAL;
>> }
>> diff --git a/include/linux/kvm.h b/include/linux/kvm.h
>> index aace6b8..3a10572 100644
>> --- a/include/linux/kvm.h
>> +++ b/include/linux/kvm.h
>> @@ -554,6 +554,9 @@ struct kvm_ppc_pvinfo {
>> #define KVM_CAP_PPC_SMT 64
>> #define KVM_CAP_PPC_RMA 65
>> #define KVM_CAP_S390_GMAP 71
>> +#ifdef __KVM_HAVE_SET_LINT1
>> +#define KVM_CAP_SET_LINT1 72
>> +#endif
>
> Actually, there is no need for __KVM_HAVE_SET_LINT1 and #ifdef. User
> land will just do a runtime check.
>
>
__KVM_HAVE_SET_LINT1 does no harm,
and it helps with compile-time checks.
Thanks,
Lai
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [Qemu-devel] [PATCH 1/1 V5] kernel/kvm: introduce KVM_SET_LINT1 and fix improper nmi emulation
2011-10-14 9:27 ` Lai Jiangshan
@ 2011-10-14 9:32 ` Jan Kiszka
0 siblings, 0 replies; 69+ messages in thread
From: Jan Kiszka @ 2011-10-14 9:32 UTC (permalink / raw)
To: Lai Jiangshan
Cc: Kenji Kaneshige, KAMEZAWA Hiroyuki, Avi Kivity,
kvm@vger.kernel.org, qemu-devel@nongnu.org
On 2011-10-14 11:27, Lai Jiangshan wrote:
> On 10/14/2011 05:07 PM, Jan Kiszka wrote:
>> On 2011-10-14 11:03, Lai Jiangshan wrote:
>>> Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
>>> button event happens. This doesn't properly emulate real hardware on
>>> which NMI button event triggers LINT1. Because of this, NMI is sent to
>>> the processor even when LINT1 is masked in LVT. For example, this
>>> causes the problem that kdump initiated by NMI sometimes doesn't work
>>> on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
>>>
>>> With this patch, we introduce KVM_SET_LINT1,
>>> and we can use KVM_SET_LINT1 to correctly emulate the NMI button
>>> without changing the old KVM_NMI behavior.
>>>
>>> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
>>> Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
>>> ---
>>> arch/x86/include/asm/kvm.h | 1 +
>>> arch/x86/kvm/irq.h | 1 +
>>> arch/x86/kvm/lapic.c | 7 +++++++
>>> arch/x86/kvm/x86.c | 8 ++++++++
>>> include/linux/kvm.h | 5 +++++
>>> 5 files changed, 22 insertions(+), 0 deletions(-)
>>> diff --git a/arch/x86/include/asm/kvm.h b/arch/x86/include/asm/kvm.h
>>> index 4d8dcbd..88d0ac3 100644
>>> --- a/arch/x86/include/asm/kvm.h
>>> +++ b/arch/x86/include/asm/kvm.h
>>> @@ -24,6 +24,7 @@
>>> #define __KVM_HAVE_DEBUGREGS
>>> #define __KVM_HAVE_XSAVE
>>> #define __KVM_HAVE_XCRS
>>> +#define __KVM_HAVE_SET_LINT1
>>>
>>> /* Architectural interrupt line count. */
>>> #define KVM_NR_INTERRUPTS 256
>>> diff --git a/arch/x86/kvm/irq.h b/arch/x86/kvm/irq.h
>>> index 53e2d08..0c96315 100644
>>> --- a/arch/x86/kvm/irq.h
>>> +++ b/arch/x86/kvm/irq.h
>>> @@ -95,6 +95,7 @@ void kvm_pic_reset(struct kvm_kpic_state *s);
>>> void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu);
>>> void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu);
>>> void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu);
>>> +void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu);
>>> void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu);
>>> void __kvm_migrate_pit_timer(struct kvm_vcpu *vcpu);
>>> void __kvm_migrate_timers(struct kvm_vcpu *vcpu);
>>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>>> index 57dcbd4..87fe36a 100644
>>> --- a/arch/x86/kvm/lapic.c
>>> +++ b/arch/x86/kvm/lapic.c
>>> @@ -1039,6 +1039,13 @@ void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu)
>>> kvm_apic_local_deliver(apic, APIC_LVT0);
>>> }
>>>
>>> +void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu)
>>> +{
>>> + struct kvm_lapic *apic = vcpu->arch.apic;
>>> +
>>> + kvm_apic_local_deliver(apic, APIC_LVT1);
>>> +}
>>> +
>>> static struct kvm_timer_ops lapic_timer_ops = {
>>> .is_periodic = lapic_is_periodic,
>>> };
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index 84a28ea..fccd094 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -2077,6 +2077,7 @@ int kvm_dev_ioctl_check_extension(long ext)
>>> case KVM_CAP_XSAVE:
>>> case KVM_CAP_ASYNC_PF:
>>> case KVM_CAP_GET_TSC_KHZ:
>>> + case KVM_CAP_SET_LINT1:
>>> r = 1;
>>> break;
>>> case KVM_CAP_COALESCED_MMIO:
>>> @@ -3264,6 +3265,13 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>>>
>>> goto out;
>>> }
>>> + case KVM_SET_LINT1: {
>>> + r = -EINVAL;
>>> + if (!irqchip_in_kernel(vcpu->kvm))
>>> + goto out;
>>> + r = 0;
>>> + kvm_apic_lint1_deliver(vcpu);
>>> + }
>>> default:
>>> r = -EINVAL;
>>> }
>>> diff --git a/include/linux/kvm.h b/include/linux/kvm.h
>>> index aace6b8..3a10572 100644
>>> --- a/include/linux/kvm.h
>>> +++ b/include/linux/kvm.h
>>> @@ -554,6 +554,9 @@ struct kvm_ppc_pvinfo {
>>> #define KVM_CAP_PPC_SMT 64
>>> #define KVM_CAP_PPC_RMA 65
>>> #define KVM_CAP_S390_GMAP 71
>>> +#ifdef __KVM_HAVE_SET_LINT1
>>> +#define KVM_CAP_SET_LINT1 72
>>> +#endif
>>
>> Actually, there is no need for __KVM_HAVE_SET_LINT1 and #ifdef. User
>> land will just do a runtime check.
>>
>>
>
> __KVM_HAVE_SET_LINT1 does no harm,
> and it helps with compile-time checks.
It's guarding an arch-specific CAP that will only be checked if there is
a need. That's in contrast to generic features that are not supported for
all archs (like __KVM_HAVE_GUEST_DEBUG -> KVM_CAP_SET_GUEST_DEBUG).
Granted, there are quite a few examples for redundant __KVM_HAVE/#ifdef
KVM_CAP in the KVM header, but let's not add more.
Jan
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [Qemu-devel] [PATCH 1/1 V5] kernel/kvm: introduce KVM_SET_LINT1 and fix improper nmi emulation
2011-10-14 9:03 ` [Qemu-devel] [PATCH 1/1 V5] kernel/kvm: introduce KVM_SET_LINT1 and " Lai Jiangshan
2011-10-14 9:07 ` Jan Kiszka
@ 2011-10-16 9:39 ` Avi Kivity
2011-10-17 9:17 ` Lai Jiangshan
2011-10-17 9:40 ` Lai Jiangshan
1 sibling, 2 replies; 69+ messages in thread
From: Avi Kivity @ 2011-10-16 9:39 UTC (permalink / raw)
To: Lai Jiangshan
Cc: kvm@vger.kernel.org, Jan Kiszka, qemu-devel@nongnu.org,
KAMEZAWA Hiroyuki, Kenji Kaneshige
On 10/14/2011 11:03 AM, Lai Jiangshan wrote:
> Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
> button event happens. This doesn't properly emulate real hardware on
> which NMI button event triggers LINT1. Because of this, NMI is sent to
> the processor even when LINT1 is masked in LVT. For example, this
> causes the problem that kdump initiated by NMI sometimes doesn't work
> on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
>
> With this patch, we introduce KVM_SET_LINT1,
> and we can use KVM_SET_LINT1 to correctly emulate the NMI button
> without changing the old KVM_NMI behavior.
>
> @@ -759,6 +762,8 @@ struct kvm_clock_data {
> #define KVM_CREATE_SPAPR_TCE _IOW(KVMIO, 0xa8, struct kvm_create_spapr_tce)
> /* Available with KVM_CAP_RMA */
> #define KVM_ALLOCATE_RMA _IOR(KVMIO, 0xa9, struct kvm_allocate_rma)
> +/* Available with KVM_CAP_SET_LINT1 for x86 */
> +#define KVM_SET_LINT1 _IO(KVMIO, 0xaa)
>
>
LINT1 may have been programmed as a level-triggered interrupt instead
of edge triggered (NMI or interrupt). We can use the ioctl argument for
the level (and pressing the NMI button needs to pulse the level to 1 and
back to 0).
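As a rough illustration of the pulse, here is a toy model of a LINT1 pin driven through a level argument. All names (`struct toy_lint1`, `toy_set_lint1()`, `toy_pulse_lint1()`) are hypothetical, the model only covers the edge-triggered case, and it ignores delivery mode and polarity for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_LVT_MASKED (1u << 16)  /* LVT mask bit */

/* Hypothetical model of the LINT1 pin as seen by one local APIC. */
struct toy_lint1 {
    uint32_t lvt;        /* LVT LINT1 register contents */
    int line;            /* current level of the pin: 0 or 1 */
    unsigned delivered;  /* number of interrupts delivered so far */
};

/* Model of an ioctl that carries the new pin level as its argument. */
static void toy_set_lint1(struct toy_lint1 *pin, int level)
{
    assert(pin != NULL);

    bool rising = (pin->line == 0) && (level != 0);
    pin->line = (level != 0);

    if (pin->lvt & TOY_LVT_MASKED) {
        return;  /* masked: the pin state changes, nothing is delivered */
    }
    if (rising) {
        pin->delivered++;  /* edge-triggered: one delivery per rising edge */
    }
}

/* Pressing the NMI button pulses the line to 1 and back to 0. */
static void toy_pulse_lint1(struct toy_lint1 *pin)
{
    toy_set_lint1(pin, 1);
    toy_set_lint1(pin, 0);
}
```

Returning the line to 0 after each press is what lets the next press produce a fresh rising edge.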
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [Qemu-devel] [PATCH 1/1 V5] kernel/kvm: introduce KVM_SET_LINT1 and fix improper nmi emulation
2011-10-16 9:39 ` Avi Kivity
@ 2011-10-17 9:17 ` Lai Jiangshan
2011-10-17 9:54 ` Avi Kivity
2011-10-17 9:40 ` Lai Jiangshan
1 sibling, 1 reply; 69+ messages in thread
From: Lai Jiangshan @ 2011-10-17 9:17 UTC (permalink / raw)
To: Avi Kivity
Cc: kvm@vger.kernel.org, Jan Kiszka, qemu-devel@nongnu.org,
KAMEZAWA Hiroyuki, Kenji Kaneshige
On 10/16/2011 05:39 PM, Avi Kivity wrote:
> On 10/14/2011 11:03 AM, Lai Jiangshan wrote:
>> Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
>> button event happens. This doesn't properly emulate real hardware on
>> which NMI button event triggers LINT1. Because of this, NMI is sent to
>> the processor even when LINT1 is masked in LVT. For example, this
>> causes the problem that kdump initiated by NMI sometimes doesn't work
>> on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
>>
>> With this patch, we introduce KVM_SET_LINT1,
>> and we can use KVM_SET_LINT1 to correctly emulate the NMI button
>> without changing the old KVM_NMI behavior.
>>
>> @@ -759,6 +762,8 @@ struct kvm_clock_data {
>> #define KVM_CREATE_SPAPR_TCE _IOW(KVMIO, 0xa8, struct kvm_create_spapr_tce)
>> /* Available with KVM_CAP_RMA */
>> #define KVM_ALLOCATE_RMA _IOR(KVMIO, 0xa9, struct kvm_allocate_rma)
>> +/* Available with KVM_CAP_SET_LINT1 for x86 */
>> +#define KVM_SET_LINT1 _IO(KVMIO, 0xaa)
>>
>>
>
> LINT1 may have been programmed as a level-triggered interrupt instead
> of edge triggered (NMI or interrupt). We can use the ioctl argument for
> the level (and pressing the NMI button needs to pulse the level to 1 and
> back to 0).
>
Hi, Avi,
How to handle level=0 in the kernel?
Or just ignore it?
Thanks,
Lai
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [Qemu-devel] [PATCH 1/1 V5] kernel/kvm: introduce KVM_SET_LINT1 and fix improper nmi emulation
2011-10-17 9:17 ` Lai Jiangshan
@ 2011-10-17 9:54 ` Avi Kivity
2011-10-17 10:21 ` Jan Kiszka
0 siblings, 1 reply; 69+ messages in thread
From: Avi Kivity @ 2011-10-17 9:54 UTC (permalink / raw)
To: Lai Jiangshan
Cc: kvm@vger.kernel.org, Jan Kiszka, qemu-devel@nongnu.org,
KAMEZAWA Hiroyuki, Kenji Kaneshige
On 10/17/2011 11:17 AM, Lai Jiangshan wrote:
> On 10/16/2011 05:39 PM, Avi Kivity wrote:
> > On 10/14/2011 11:03 AM, Lai Jiangshan wrote:
> >> Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
> >> button event happens. This doesn't properly emulate real hardware on
> >> which NMI button event triggers LINT1. Because of this, NMI is sent to
> >> the processor even when LINT1 is masked in LVT. For example, this
> >> causes the problem that kdump initiated by NMI sometimes doesn't work
> >> on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
> >>
> >> With this patch, we introduce KVM_SET_LINT1,
> >> and we can use KVM_SET_LINT1 to correctly emulate the NMI button
> >> without changing the old KVM_NMI behavior.
> >>
> >> @@ -759,6 +762,8 @@ struct kvm_clock_data {
> >> #define KVM_CREATE_SPAPR_TCE _IOW(KVMIO, 0xa8, struct kvm_create_spapr_tce)
> >> /* Available with KVM_CAP_RMA */
> >> #define KVM_ALLOCATE_RMA _IOR(KVMIO, 0xa9, struct kvm_allocate_rma)
> >> +/* Available with KVM_CAP_SET_LINT1 for x86 */
> >> +#define KVM_SET_LINT1 _IO(KVMIO, 0xaa)
> >>
> >>
> >
> > LINT1 may have been programmed as a level-triggered interrupt instead
> > of edge triggered (NMI or interrupt). We can use the ioctl argument for
> > the level (and pressing the NMI button needs to pulse the level to 1 and
> > back to 0).
> >
>
> Hi, Avi,
>
> How should level=0 be handled in the kernel?
> Or should it just be ignored?
It needs to be handled according to the delivery mode, polarity, and
trigger mode bits in the LVT.
For example, a Fixed delivery mode with polarity 1 and level trigger
mode will post the interrupt as long as it is in level 0 and not masked
by the ISR. __apic_accept_irq() should handle this.
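As a rough illustration of the LVT bits mentioned above, the sketch below
decodes an LVT LINT entry and decides whether a pin change should post an
interrupt. The bit positions follow the local APIC LVT layout in the Intel
SDM. The struct and function names (lvt_fields, lvt_decode, lvt_pin_asserted,
lvt_should_post) are invented for this example; this is not the KVM or QEMU
code, and it ignores in-service state and delivery-mode-specific rules.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit layout of an LVT LINT0/LINT1 entry (local APIC, per the Intel SDM). */
#define LVT_DELIVERY_MODE(lvt)  (((lvt) >> 8) & 0x7u)
#define LVT_ACTIVE_LOW          (1u << 13)  /* polarity: 1 = active low */
#define LVT_LEVEL_TRIGGERED     (1u << 15)  /* trigger mode: 1 = level */
#define LVT_MASKED              (1u << 16)

/* Delivery mode encodings. */
enum { DM_FIXED = 0, DM_SMI = 2, DM_NMI = 4, DM_INIT = 5, DM_EXTINT = 7 };

/* Hypothetical decoded view of one LVT entry. */
struct lvt_fields {
    unsigned delivery_mode;
    bool active_low;
    bool level_triggered;
    bool masked;
};

static struct lvt_fields lvt_decode(uint32_t lvt)
{
    struct lvt_fields f;

    f.delivery_mode = LVT_DELIVERY_MODE(lvt);
    f.active_low = (lvt & LVT_ACTIVE_LOW) != 0;
    f.level_triggered = (lvt & LVT_LEVEL_TRIGGERED) != 0;
    f.masked = (lvt & LVT_MASKED) != 0;
    return f;
}

/* True when the electrical level counts as "asserted" for this polarity. */
static bool lvt_pin_asserted(const struct lvt_fields *f, int level)
{
    return f->active_low ? (level == 0) : (level != 0);
}

/*
 * Decide whether a pin change from prev_level to level should post an
 * interrupt. Simplification: in-service state and delivery-mode-specific
 * rules are not modelled here.
 */
static bool lvt_should_post(const struct lvt_fields *f, int prev_level,
                            int level)
{
    if (f->masked) {
        return false;
    }
    if (f->level_triggered) {
        /* Level trigger: post for as long as the pin is asserted. */
        return lvt_pin_asserted(f, level);
    }
    /* Edge trigger: post only on the inactive -> active transition. */
    return !lvt_pin_asserted(f, prev_level) && lvt_pin_asserted(f, level);
}
```

In this model, the Fixed, polarity 1, level-trigger example posts while the
pin is at 0, and an edge-triggered, active-high NMI entry posts once when the
button pulses the level from 0 to 1.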
--
error compiling committee.c: too many arguments to function
* Re: [Qemu-devel] [PATCH 1/1 V5] kernel/kvm: introduce KVM_SET_LINT1 and fix improper nmi emulation
2011-10-17 9:54 ` Avi Kivity
@ 2011-10-17 10:21 ` Jan Kiszka
0 siblings, 0 replies; 69+ messages in thread
From: Jan Kiszka @ 2011-10-17 10:21 UTC (permalink / raw)
To: Avi Kivity
Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org, Lai Jiangshan,
KAMEZAWA Hiroyuki, Kenji Kaneshige
On 2011-10-17 11:54, Avi Kivity wrote:
> On 10/17/2011 11:17 AM, Lai Jiangshan wrote:
>> On 10/16/2011 05:39 PM, Avi Kivity wrote:
>>> On 10/14/2011 11:03 AM, Lai Jiangshan wrote:
>>>> Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
>>>> button event happens. This doesn't properly emulate real hardware on
>>>> which NMI button event triggers LINT1. Because of this, NMI is sent to
>>>> the processor even when LINT1 is masked in LVT. For example, this
>>>> causes the problem that kdump initiated by NMI sometimes doesn't work
>>>> on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
>>>>
>>>> With this patch, we introduce KVM_SET_LINT1,
>>>> and we can use KVM_SET_LINT1 to correctly emulate the NMI button
>>>> without changing the old KVM_NMI behavior.
>>>>
>>>> @@ -759,6 +762,8 @@ struct kvm_clock_data {
>>>> #define KVM_CREATE_SPAPR_TCE _IOW(KVMIO, 0xa8, struct kvm_create_spapr_tce)
>>>> /* Available with KVM_CAP_RMA */
>>>> #define KVM_ALLOCATE_RMA _IOR(KVMIO, 0xa9, struct kvm_allocate_rma)
>>>> +/* Available with KVM_CAP_SET_LINT1 for x86 */
>>>> +#define KVM_SET_LINT1 _IO(KVMIO, 0xaa)
>>>>
>>>>
>>>
>>> LINT1 may have been programmed as a level-triggered interrupt instead
>>> of edge triggered (NMI or interrupt). We can use the ioctl argument for
>>> the level (and pressing the NMI button needs to pulse the level to 1 and
>>> back to 0).
>>>
>>
>> Hi, Avi,
>>
>> How should level=0 be handled in the kernel?
>> Or should it just be ignored?
>
> It needs to be handled according to the delivery mode, polarity, and
> trigger mode bits in the LVT.
>
> For example, a Fixed delivery mode with polarity 1 and level trigger
> mode will post the interrupt as long as it is in level 0 and not masked
> by the ISR. __apic_accept_irq() should handle this.
But I think it's not yet fully prepared for this (for example, level is
only considered for APIC_DM_INIT).
Jan
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
* Re: [Qemu-devel] [PATCH 1/1 V5] kernel/kvm: introduce KVM_SET_LINT1 and fix improper nmi emulation
2011-10-16 9:39 ` Avi Kivity
2011-10-17 9:17 ` Lai Jiangshan
@ 2011-10-17 9:40 ` Lai Jiangshan
2011-10-17 9:49 ` Avi Kivity
1 sibling, 1 reply; 69+ messages in thread
From: Lai Jiangshan @ 2011-10-17 9:40 UTC (permalink / raw)
To: Avi Kivity
Cc: kvm@vger.kernel.org, Jan Kiszka, qemu-devel@nongnu.org,
KAMEZAWA Hiroyuki, Kenji Kaneshige
On 10/16/2011 05:39 PM, Avi Kivity wrote:
> On 10/14/2011 11:03 AM, Lai Jiangshan wrote:
>> Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
>> button event happens. This doesn't properly emulate real hardware on
>> which NMI button event triggers LINT1. Because of this, NMI is sent to
>> the processor even when LINT1 is masked in LVT. For example, this
>> causes the problem that kdump initiated by NMI sometimes doesn't work
>> on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
>>
>> With this patch, we introduce KVM_SET_LINT1,
>> and we can use KVM_SET_LINT1 to correctly emulate the NMI button
>> without changing the old KVM_NMI behavior.
>>
>> @@ -759,6 +762,8 @@ struct kvm_clock_data {
>> #define KVM_CREATE_SPAPR_TCE _IOW(KVMIO, 0xa8, struct kvm_create_spapr_tce)
>> /* Available with KVM_CAP_RMA */
>> #define KVM_ALLOCATE_RMA _IOR(KVMIO, 0xa9, struct kvm_allocate_rma)
>> +/* Available with KVM_CAP_SET_LINT1 for x86 */
>> +#define KVM_SET_LINT1 _IO(KVMIO, 0xaa)
>>
>>
>
> LINT1 may have been programmed as a level-triggered interrupt instead
> of edge triggered (NMI or interrupt). We can use the ioctl argument for
> the level (and pressing the NMI button needs to pulse the level to 1 and
> back to 0).
>
Hi, Avi, Jan,
Which approach do you prefer?
I need to know the answer before spending too much time respinning
the patch.
1) Fix the KVM_NMI emulation (the v3 patchset)
- It fixes the problem directly and matches real hardware
more closely, but it changes KVM_NMI behavior.
- Requires both a kernel-side and a userspace-side fix.
2) Get the LAPIC state from the kernel irqchip, and inject the NMI if it is
allowed (the v4 patchset)
- Simple; does not change any kernel behavior.
- Only needs the userspace-side fix.
3) Add KVM_SET_LINT1 (the v5 patchset)
- Does not change the kernel's KVM_NMI behavior.
- Much more complex.
- Requires both a kernel-side and a userspace-side fix.
- Userspace must also handle the case where KVM_SET_LINT1 is
unavailable, which reuses all of the code from approach 2). In effect,
this approach equals approach 2) plus the KVM_SET_LINT1 ioctl.
This is an urgent bug for us, so we need to settle it soon.
Thanks,
Lai
* Re: [Qemu-devel] [PATCH 1/1 V5] kernel/kvm: introduce KVM_SET_LINT1 and fix improper nmi emulation
2011-10-17 9:40 ` Lai Jiangshan
@ 2011-10-17 9:49 ` Avi Kivity
2011-10-17 16:00 ` [Qemu-devel] [PATCH 1/1 V6] qemu-kvm: " Lai Jiangshan
0 siblings, 1 reply; 69+ messages in thread
From: Avi Kivity @ 2011-10-17 9:49 UTC (permalink / raw)
To: Lai Jiangshan
Cc: kvm@vger.kernel.org, Jan Kiszka, qemu-devel@nongnu.org,
KAMEZAWA Hiroyuki, Kenji Kaneshige
On 10/17/2011 11:40 AM, Lai Jiangshan wrote:
> >>
> >
> > LINT1 may have been programmed as a level-triggered interrupt instead
> > of edge triggered (NMI or interrupt). We can use the ioctl argument for
> > the level (and pressing the NMI button needs to pulse the level to 1 and
> > back to 0).
> >
>
> Hi, Avi, Jan,
>
> Which approach do you prefer?
> I need to know the answer before spending too much time respinning
> the patch.
Yes, sorry about the slow and sometimes conflicting feedback.
> 1) Fix the KVM_NMI emulation (the v3 patchset)
> - It fixes the problem directly and matches real hardware
> more closely, but it changes KVM_NMI behavior.
> - Requires both a kernel-side and a userspace-side fix.
>
> 2) Get the LAPIC state from the kernel irqchip, and inject the NMI if it is
> allowed (the v4 patchset)
> - Simple; does not change any kernel behavior.
> - Only needs the userspace-side fix.
>
> 3) Add KVM_SET_LINT1 (the v5 patchset)
> - Does not change the kernel's KVM_NMI behavior.
> - Much more complex.
> - Requires both a kernel-side and a userspace-side fix.
> - Userspace must also handle the case where KVM_SET_LINT1 is
> unavailable, which reuses all of the code from approach 2). In effect,
> this approach equals approach 2) plus the KVM_SET_LINT1 ioctl.
>
> This is an urgent bug for us, so we need to settle it soon.
While (1) is simple, it overloads a single ioctl with two meanings,
that's not so good.
Whether we do (1) or (3), we need (2) as well, for older kernels.
So I recommend first focusing on (2) and merging it, then doing (3).
(note an additional issue with 3 is whether to make it a vm or vcpu
ioctl - we've been assuming vcpu ioctl but it's not necessarily the best
choice).
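The recommended ordering can be pictured as a capability check with a
fallback. This is only a sketch: KVM_CAP_SET_LINT1 is the capability proposed
in the v5 patch, not an existing kernel interface, and the enum and function
names below are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical names for the ways userspace could emulate the NMI button. */
enum nmi_button_path {
    NMI_PATH_USERSPACE_APIC, /* no in-kernel irqchip: deliver LINT1 in the
                                userspace APIC model */
    NMI_PATH_SET_LINT1,      /* approach (3): proposed KVM_SET_LINT1 ioctl */
    NMI_PATH_READ_LAPIC      /* approach (2): read the LAPIC state, then
                                issue KVM_NMI only if LINT1 allows it */
};

/*
 * Pick a path. Approach (2) stays in place as the fallback for kernels that
 * lack the (proposed) capability, which is why it has to be merged first.
 */
static enum nmi_button_path choose_nmi_button_path(bool irqchip_in_kernel,
                                                   bool has_set_lint1_cap)
{
    if (!irqchip_in_kernel) {
        return NMI_PATH_USERSPACE_APIC;
    }
    if (has_set_lint1_cap) {
        return NMI_PATH_SET_LINT1;
    }
    return NMI_PATH_READ_LAPIC;
}
```

On an older kernel with an in-kernel irqchip the function returns
NMI_PATH_READ_LAPIC, so userspace behaves correctly whether or not (3) is ever
merged.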
--
error compiling committee.c: too many arguments to function
* [Qemu-devel] [PATCH 1/1 V6] qemu-kvm: fix improper nmi emulation
2011-10-17 9:49 ` Avi Kivity
@ 2011-10-17 16:00 ` Lai Jiangshan
2011-10-18 19:41 ` Jan Kiszka
2011-12-07 10:29 ` Avi Kivity
0 siblings, 2 replies; 69+ messages in thread
From: Lai Jiangshan @ 2011-10-17 16:00 UTC (permalink / raw)
To: Avi Kivity
Cc: kvm@vger.kernel.org, Jan Kiszka, qemu-devel@nongnu.org,
KAMEZAWA Hiroyuki, Kenji Kaneshige
On 10/17/2011 05:49 PM, Avi Kivity wrote:
> On 10/17/2011 11:40 AM, Lai Jiangshan wrote:
>>>>
>>>
>>> LINT1 may have been programmed as a level-triggered interrupt instead
>>> of edge triggered (NMI or interrupt). We can use the ioctl argument for
>>> the level (and pressing the NMI button needs to pulse the level to 1 and
>>> back to 0).
>>>
>>
>> Hi, Avi, Jan,
>>
>> Which approach do you prefer?
>> I need to know the answer before spending too much time respinning
>> the patch.
>
> Yes, sorry about the slow and sometimes conflicting feedback.
>
>> 1) Fix the KVM_NMI emulation (the v3 patchset)
>> - It fixes the problem directly and matches real hardware
>> more closely, but it changes KVM_NMI behavior.
>> - Requires both a kernel-side and a userspace-side fix.
>>
>> 2) Get the LAPIC state from the kernel irqchip, and inject the NMI if it is
>> allowed (the v4 patchset)
>> - Simple; does not change any kernel behavior.
>> - Only needs the userspace-side fix.
>>
>> 3) Add KVM_SET_LINT1 (the v5 patchset)
>> - Does not change the kernel's KVM_NMI behavior.
>> - Much more complex.
>> - Requires both a kernel-side and a userspace-side fix.
>> - Userspace must also handle the case where KVM_SET_LINT1 is
>> unavailable, which reuses all of the code from approach 2). In effect,
>> this approach equals approach 2) plus the KVM_SET_LINT1 ioctl.
>>
>> This is an urgent bug for us, so we need to settle it soon.
>
> While (1) is simple, it overloads a single ioctl with two meanings,
> that's not so good.
>
> Whether we do (1) or (3), we need (2) as well, for older kernels.
>
> So I recommend first focusing on (2) and merging it, then doing (3).
>
> (note an additional issue with 3 is whether to make it a vm or vcpu
> ioctl - we've been assuming vcpu ioctl but it's not necessarily the best
> choice).
>
This is approach 2).
It only changes the userspace side; the kernel side is not touched.
It is updated from the previous v4 patch and fixes the problems found by Jan.
----------------------------
From: Lai Jiangshan <laijs@cn.fujitsu.com>
Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
button event happens. This doesn't properly emulate real hardware on
which NMI button event triggers LINT1. Because of this, NMI is sent to
the processor even when LINT1 is masked in LVT. For example, this
causes the problem that kdump initiated by NMI sometimes doesn't work
on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
With this patch, inject-nmi request is handled as follows.
- When in-kernel irqchip is disabled, deliver LINT1 instead of NMI
interrupt.
- When in-kernel irqchip is enabled, read the in-kernel LAPIC state
and test APIC_LVT_MASKED; if LINT1 is unmasked, deliver
the NMI directly. (Suggested by Jan Kiszka)
Changed from old version:
re-implemented following Jan's suggestion.
fixed the race found by Jan.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
---
hw/apic.c | 33 +++++++++++++++++++++++++++++++++
hw/apic.h | 1 +
monitor.c | 6 +++++-
3 files changed, 39 insertions(+), 1 deletions(-)
diff --git a/hw/apic.c b/hw/apic.c
index 69d6ac5..922796a 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -205,6 +205,39 @@ void apic_deliver_pic_intr(DeviceState *d, int level)
}
}
+static inline uint32_t kapic_reg(struct kvm_lapic_state *kapic, int reg_id);
+
+static void kvm_irqchip_deliver_nmi(void *p)
+{
+ APICState *s = p;
+ struct kvm_lapic_state klapic;
+ uint32_t lvt;
+
+ kvm_get_lapic(s->cpu_env, &klapic);
+ lvt = kapic_reg(&klapic, 0x32 + APIC_LVT_LINT1);
+
+ if (lvt & APIC_LVT_MASKED) {
+ return;
+ }
+
+ if (((lvt >> 8) & 7) != APIC_DM_NMI) {
+ return;
+ }
+
+ kvm_vcpu_ioctl(s->cpu_env, KVM_NMI);
+}
+
+void apic_deliver_nmi(DeviceState *d)
+{
+ APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
+
+ if (kvm_irqchip_in_kernel()) {
+ run_on_cpu(s->cpu_env, kvm_irqchip_deliver_nmi, s);
+ } else {
+ apic_local_deliver(s, APIC_LVT_LINT1);
+ }
+}
+
#define foreach_apic(apic, deliver_bitmask, code) \
{\
int __i, __j, __mask;\
diff --git a/hw/apic.h b/hw/apic.h
index c857d52..3a4be0a 100644
--- a/hw/apic.h
+++ b/hw/apic.h
@@ -10,6 +10,7 @@ void apic_deliver_irq(uint8_t dest, uint8_t dest_mode,
uint8_t trigger_mode);
int apic_accept_pic_intr(DeviceState *s);
void apic_deliver_pic_intr(DeviceState *s, int level);
+void apic_deliver_nmi(DeviceState *d);
int apic_get_interrupt(DeviceState *s);
void apic_reset_irq_delivered(void);
int apic_get_irq_delivered(void);
diff --git a/monitor.c b/monitor.c
index cb485bf..0b81f17 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2616,7 +2616,11 @@ static int do_inject_nmi(Monitor *mon, const QDict *qdict, QObject **ret_data)
CPUState *env;
for (env = first_cpu; env != NULL; env = env->next_cpu) {
- cpu_interrupt(env, CPU_INTERRUPT_NMI);
+ if (!env->apic_state) {
+ cpu_interrupt(env, CPU_INTERRUPT_NMI);
+ } else {
+ apic_deliver_nmi(env->apic_state);
+ }
}
return 0;
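For readers unfamiliar with the register arithmetic in
kvm_irqchip_deliver_nmi() above: the APIC register image is indexed in
16-byte strides, so register index 0x32 is the first LVT entry (byte offset
0x320). Assuming APIC_LVT_LINT1 is 4, as I believe it is in hw/apic.c,
0x32 + APIC_LVT_LINT1 selects index 0x36, i.e. offset 0x360, which is the LVT
LINT1 register. The sketch below mirrors the same check on a plain byte
buffer. The buffer size, helper names, and constants are local stand-ins for
illustration, not the QEMU or KVM definitions.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define APIC_PAGE_SIZE  0x400      /* size of the register image (assumed) */
#define LVT_BASE_INDEX  0x32       /* register index of the first LVT entry */
#define LVT_LINT1       4          /* LINT1 slot within the LVT (assumed) */
#define LVT_MASKED      (1u << 16) /* mask bit of an LVT entry */
#define DM_NMI          4          /* NMI delivery mode encoding */

/* Read one 32-bit register; registers sit at 16-byte strides. */
static uint32_t apic_page_reg(const uint8_t *page, int reg_id)
{
    uint32_t value;

    memcpy(&value, page + ((size_t)reg_id << 4), sizeof(value));
    return value;
}

/* Write one 32-bit register (only needed to set up an example image). */
static void apic_page_set_reg(uint8_t *page, int reg_id, uint32_t value)
{
    memcpy(page + ((size_t)reg_id << 4), &value, sizeof(value));
}

/* Same gate as the patch: LINT1 unmasked and programmed for NMI delivery. */
static bool lint1_allows_nmi(const uint8_t *page)
{
    uint32_t lvt = apic_page_reg(page, LVT_BASE_INDEX + LVT_LINT1);

    if (lvt & LVT_MASKED) {
        return false;
    }
    return ((lvt >> 8) & 7) == DM_NMI;
}
```

Only when lint1_allows_nmi() would return true does the patch go on to issue
KVM_NMI for that vCPU; a masked LINT1, or one programmed for a different
delivery mode, drops the button event.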
* Re: [Qemu-devel] [PATCH 1/1 V6] qemu-kvm: fix improper nmi emulation
2011-10-17 16:00 ` [Qemu-devel] [PATCH 1/1 V6] qemu-kvm: " Lai Jiangshan
@ 2011-10-18 19:41 ` Jan Kiszka
2011-10-19 6:33 ` Lai Jiangshan
2011-10-19 9:29 ` [Qemu-devel] [PATCH 1/1 V6] qemu-kvm: " Avi Kivity
2011-12-07 10:29 ` Avi Kivity
1 sibling, 2 replies; 69+ messages in thread
From: Jan Kiszka @ 2011-10-18 19:41 UTC (permalink / raw)
To: Lai Jiangshan
Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org, Avi Kivity,
KAMEZAWA Hiroyuki, Kenji Kaneshige
On 2011-10-17 18:00, Lai Jiangshan wrote:
> On 10/17/2011 05:49 PM, Avi Kivity wrote:
>> On 10/17/2011 11:40 AM, Lai Jiangshan wrote:
>>>>>
>>>>
>>>> LINT1 may have been programmed as a level-triggered interrupt instead
>>>> of edge triggered (NMI or interrupt). We can use the ioctl argument for
>>>> the level (and pressing the NMI button needs to pulse the level to 1 and
>>>> back to 0).
>>>>
>>>
>>> Hi, Avi, Jan,
>>>
>>> Which approach do you prefer?
>>> I need to know the answer before spending too much time respinning
>>> the patch.
>>
>> Yes, sorry about the slow and sometimes conflicting feedback.
>>
>>> 1) Fix the KVM_NMI emulation (the v3 patchset)
>>> - It fixes the problem directly and matches real hardware
>>> more closely, but it changes KVM_NMI behavior.
>>> - Requires both a kernel-side and a userspace-side fix.
>>>
>>> 2) Get the LAPIC state from the kernel irqchip, and inject the NMI if it is
>>> allowed (the v4 patchset)
>>> - Simple; does not change any kernel behavior.
>>> - Only needs the userspace-side fix.
>>>
>>> 3) Add KVM_SET_LINT1 (the v5 patchset)
>>> - Does not change the kernel's KVM_NMI behavior.
>>> - Much more complex.
>>> - Requires both a kernel-side and a userspace-side fix.
>>> - Userspace must also handle the case where KVM_SET_LINT1 is
>>> unavailable, which reuses all of the code from approach 2). In effect,
>>> this approach equals approach 2) plus the KVM_SET_LINT1 ioctl.
>>>
>>> This is an urgent bug for us, so we need to settle it soon.
>>
>> While (1) is simple, it overloads a single ioctl with two meanings,
>> that's not so good.
>>
>> Whether we do (1) or (3), we need (2) as well, for older kernels.
>>
>> So I recommend first focusing on (2) and merging it, then doing (3).
>>
>> (note an additional issue with 3 is whether to make it a vm or vcpu
>> ioctl - we've been assuming vcpu ioctl but it's not necessarily the best
>> choice).
>>
>
> This is approach 2).
> It only changes the userspace side; the kernel side is not touched.
> It is updated from the previous v4 patch and fixes the problems found by Jan.
> ----------------------------
>
> From: Lai Jiangshan <laijs@cn.fujitsu.com>
>
> Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
> button event happens. This doesn't properly emulate real hardware on
> which NMI button event triggers LINT1. Because of this, NMI is sent to
> the processor even when LINT1 is masked in LVT. For example, this
> causes the problem that kdump initiated by NMI sometimes doesn't work
> on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
>
> With this patch, inject-nmi request is handled as follows.
>
> - When in-kernel irqchip is disabled, deliver LINT1 instead of NMI
> interrupt.
> - When in-kernel irqchip is enabled, read the in-kernel LAPIC state
> and test APIC_LVT_MASKED; if LINT1 is unmasked, deliver
> the NMI directly. (Suggested by Jan Kiszka)
>
> Changed from old version:
> re-implemented following Jan's suggestion.
> fixed the race found by Jan.
>
> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
> Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
> ---
> hw/apic.c | 33 +++++++++++++++++++++++++++++++++
> hw/apic.h | 1 +
> monitor.c | 6 +++++-
> 3 files changed, 39 insertions(+), 1 deletions(-)
> diff --git a/hw/apic.c b/hw/apic.c
> index 69d6ac5..922796a 100644
> --- a/hw/apic.c
> +++ b/hw/apic.c
> @@ -205,6 +205,39 @@ void apic_deliver_pic_intr(DeviceState *d, int level)
> }
> }
>
> +static inline uint32_t kapic_reg(struct kvm_lapic_state *kapic, int reg_id);
> +
> +static void kvm_irqchip_deliver_nmi(void *p)
> +{
> + APICState *s = p;
> + struct kvm_lapic_state klapic;
> + uint32_t lvt;
> +
> + kvm_get_lapic(s->cpu_env, &klapic);
> + lvt = kapic_reg(&klapic, 0x32 + APIC_LVT_LINT1);
> +
> + if (lvt & APIC_LVT_MASKED) {
> + return;
> + }
> +
> + if (((lvt >> 8) & 7) != APIC_DM_NMI) {
> + return;
> + }
> +
> + kvm_vcpu_ioctl(s->cpu_env, KVM_NMI);
> +}
> +
> +void apic_deliver_nmi(DeviceState *d)
> +{
> + APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> +
> + if (kvm_irqchip_in_kernel()) {
> + run_on_cpu(s->cpu_env, kvm_irqchip_deliver_nmi, s);
> + } else {
> + apic_local_deliver(s, APIC_LVT_LINT1);
> + }
> +}
> +
> #define foreach_apic(apic, deliver_bitmask, code) \
> {\
> int __i, __j, __mask;\
> diff --git a/hw/apic.h b/hw/apic.h
> index c857d52..3a4be0a 100644
> --- a/hw/apic.h
> +++ b/hw/apic.h
> @@ -10,6 +10,7 @@ void apic_deliver_irq(uint8_t dest, uint8_t dest_mode,
> uint8_t trigger_mode);
> int apic_accept_pic_intr(DeviceState *s);
> void apic_deliver_pic_intr(DeviceState *s, int level);
> +void apic_deliver_nmi(DeviceState *d);
> int apic_get_interrupt(DeviceState *s);
> void apic_reset_irq_delivered(void);
> int apic_get_irq_delivered(void);
> diff --git a/monitor.c b/monitor.c
> index cb485bf..0b81f17 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -2616,7 +2616,11 @@ static int do_inject_nmi(Monitor *mon, const QDict *qdict, QObject **ret_data)
> CPUState *env;
>
> for (env = first_cpu; env != NULL; env = env->next_cpu) {
> - cpu_interrupt(env, CPU_INTERRUPT_NMI);
> + if (!env->apic_state) {
> + cpu_interrupt(env, CPU_INTERRUPT_NMI);
> + } else {
> + apic_deliver_nmi(env->apic_state);
> + }
> }
>
> return 0;
Looks OK to me.
Please don't forget to bake a qemu-only patch for those bits that apply
to upstream as well (ie. the user space APIC path).
Jan
* Re: [Qemu-devel] [PATCH 1/1 V6] qemu-kvm: fix improper nmi emulation
2011-10-18 19:41 ` Jan Kiszka
@ 2011-10-19 6:33 ` Lai Jiangshan
2011-10-19 10:57 ` Jan Kiszka
2011-10-19 9:29 ` [Qemu-devel] [PATCH 1/1 V6] qemu-kvm: " Avi Kivity
1 sibling, 1 reply; 69+ messages in thread
From: Lai Jiangshan @ 2011-10-19 6:33 UTC (permalink / raw)
To: Jan Kiszka
Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org, Avi Kivity,
KAMEZAWA Hiroyuki, Kenji Kaneshige
On 10/19/2011 03:41 AM, Jan Kiszka wrote:
> On 2011-10-17 18:00, Lai Jiangshan wrote:
>> On 10/17/2011 05:49 PM, Avi Kivity wrote:
>>> On 10/17/2011 11:40 AM, Lai Jiangshan wrote:
>>>>>>
>>>>>
>>>>> LINT1 may have been programmed as a level-triggered interrupt instead
>>>>> of edge triggered (NMI or interrupt). We can use the ioctl argument for
>>>>> the level (and pressing the NMI button needs to pulse the level to 1 and
>>>>> back to 0).
>>>>>
>>>>
>>>> Hi, Avi, Jan,
>>>>
>>>> Which approach do you prefer?
>>>> I need to know the answer before spending too much time respinning
>>>> the patch.
>>>
>>> Yes, sorry about the slow and sometimes conflicting feedback.
>>>
>>>> 1) Fix the KVM_NMI emulation (the v3 patchset)
>>>> - It fixes the problem directly and matches real hardware
>>>> more closely, but it changes KVM_NMI behavior.
>>>> - Requires both a kernel-side and a userspace-side fix.
>>>>
>>>> 2) Get the LAPIC state from the kernel irqchip, and inject the NMI if it is
>>>> allowed (the v4 patchset)
>>>> - Simple; does not change any kernel behavior.
>>>> - Only needs the userspace-side fix.
>>>>
>>>> 3) Add KVM_SET_LINT1 (the v5 patchset)
>>>> - Does not change the kernel's KVM_NMI behavior.
>>>> - Much more complex.
>>>> - Requires both a kernel-side and a userspace-side fix.
>>>> - Userspace must also handle the case where KVM_SET_LINT1 is
>>>> unavailable, which reuses all of the code from approach 2). In effect,
>>>> this approach equals approach 2) plus the KVM_SET_LINT1 ioctl.
>>>>
>>>> This is an urgent bug for us, so we need to settle it soon.
>>>
>>> While (1) is simple, it overloads a single ioctl with two meanings,
>>> that's not so good.
>>>
>>> Whether we do (1) or (3), we need (2) as well, for older kernels.
>>>
>>> So I recommend first focusing on (2) and merging it, then doing (3).
>>>
>>> (note an additional issue with 3 is whether to make it a vm or vcpu
>>> ioctl - we've been assuming vcpu ioctl but it's not necessarily the best
>>> choice).
>>>
>>
>> This is approach 2).
>> It only changes the userspace side; the kernel side is not touched.
>> It is updated from the previous v4 patch and fixes the problems found by Jan.
>> ----------------------------
>>
>> From: Lai Jiangshan <laijs@cn.fujitsu.com>
>>
>> Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
>> button event happens. This doesn't properly emulate real hardware on
>> which NMI button event triggers LINT1. Because of this, NMI is sent to
>> the processor even when LINT1 is masked in LVT. For example, this
>> causes the problem that kdump initiated by NMI sometimes doesn't work
>> on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
>>
>> With this patch, inject-nmi request is handled as follows.
>>
>> - When in-kernel irqchip is disabled, deliver LINT1 instead of NMI
>> interrupt.
>> - When in-kernel irqchip is enabled, read the in-kernel LAPIC state
>> and test APIC_LVT_MASKED; if LINT1 is unmasked, deliver
>> the NMI directly. (Suggested by Jan Kiszka)
>>
>> Changed from old version:
>> re-implemented following Jan's suggestion.
>> fixed the race found by Jan.
>>
>> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
>> Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
>> ---
>> hw/apic.c | 33 +++++++++++++++++++++++++++++++++
>> hw/apic.h | 1 +
>> monitor.c | 6 +++++-
>> 3 files changed, 39 insertions(+), 1 deletions(-)
>> diff --git a/hw/apic.c b/hw/apic.c
>> index 69d6ac5..922796a 100644
>> --- a/hw/apic.c
>> +++ b/hw/apic.c
>> @@ -205,6 +205,39 @@ void apic_deliver_pic_intr(DeviceState *d, int level)
>> }
>> }
>>
>> +static inline uint32_t kapic_reg(struct kvm_lapic_state *kapic, int reg_id);
>> +
>> +static void kvm_irqchip_deliver_nmi(void *p)
>> +{
>> + APICState *s = p;
>> + struct kvm_lapic_state klapic;
>> + uint32_t lvt;
>> +
>> + kvm_get_lapic(s->cpu_env, &klapic);
>> + lvt = kapic_reg(&klapic, 0x32 + APIC_LVT_LINT1);
>> +
>> + if (lvt & APIC_LVT_MASKED) {
>> + return;
>> + }
>> +
>> + if (((lvt >> 8) & 7) != APIC_DM_NMI) {
>> + return;
>> + }
>> +
>> + kvm_vcpu_ioctl(s->cpu_env, KVM_NMI);
>> +}
>> +
>> +void apic_deliver_nmi(DeviceState *d)
>> +{
>> + APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
>> +
>> + if (kvm_irqchip_in_kernel()) {
>> + run_on_cpu(s->cpu_env, kvm_irqchip_deliver_nmi, s);
>> + } else {
>> + apic_local_deliver(s, APIC_LVT_LINT1);
>> + }
>> +}
>> +
>> #define foreach_apic(apic, deliver_bitmask, code) \
>> {\
>> int __i, __j, __mask;\
>> diff --git a/hw/apic.h b/hw/apic.h
>> index c857d52..3a4be0a 100644
>> --- a/hw/apic.h
>> +++ b/hw/apic.h
>> @@ -10,6 +10,7 @@ void apic_deliver_irq(uint8_t dest, uint8_t dest_mode,
>> uint8_t trigger_mode);
>> int apic_accept_pic_intr(DeviceState *s);
>> void apic_deliver_pic_intr(DeviceState *s, int level);
>> +void apic_deliver_nmi(DeviceState *d);
>> int apic_get_interrupt(DeviceState *s);
>> void apic_reset_irq_delivered(void);
>> int apic_get_irq_delivered(void);
>> diff --git a/monitor.c b/monitor.c
>> index cb485bf..0b81f17 100644
>> --- a/monitor.c
>> +++ b/monitor.c
>> @@ -2616,7 +2616,11 @@ static int do_inject_nmi(Monitor *mon, const QDict *qdict, QObject **ret_data)
>> CPUState *env;
>>
>> for (env = first_cpu; env != NULL; env = env->next_cpu) {
>> - cpu_interrupt(env, CPU_INTERRUPT_NMI);
>> + if (!env->apic_state) {
>> + cpu_interrupt(env, CPU_INTERRUPT_NMI);
>> + } else {
>> + apic_deliver_nmi(env->apic_state);
>> + }
>> }
>>
>> return 0;
>
> Looks OK to me.
>
> Please don't forget to bake a qemu-only patch for those bits that apply
> to upstream as well (ie. the user space APIC path).
>
> Jan
>
I did forget it.
Did you mean we need to add "#ifdef KVM_CAP_IRQCHIP" back?
Thanks
Lai
* Re: [Qemu-devel] [PATCH 1/1 V6] qemu-kvm: fix improper nmi emulation
2011-10-19 6:33 ` Lai Jiangshan
@ 2011-10-19 10:57 ` Jan Kiszka
2011-10-19 15:21 ` [Qemu-devel] [PATCH 1/1 V6] qemu: " Lai Jiangshan
0 siblings, 1 reply; 69+ messages in thread
From: Jan Kiszka @ 2011-10-19 10:57 UTC (permalink / raw)
To: Lai Jiangshan
Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org, Avi Kivity,
KAMEZAWA Hiroyuki, Kenji Kaneshige
On 2011-10-19 08:33, Lai Jiangshan wrote:
> On 10/19/2011 03:41 AM, Jan Kiszka wrote:
>> On 2011-10-17 18:00, Lai Jiangshan wrote:
>>> On 10/17/2011 05:49 PM, Avi Kivity wrote:
>>>> On 10/17/2011 11:40 AM, Lai Jiangshan wrote:
>>>>>>>
>>>>>>
>>>>>> LINT1 may have been programmed as a level-triggered interrupt instead
>>>>>> of edge triggered (NMI or interrupt). We can use the ioctl argument for
>>>>>> the level (and pressing the NMI button needs to pulse the level to 1 and
>>>>>> back to 0).
>>>>>>
>>>>>
>>>>> Hi, Avi, Jan,
>>>>>
>>>>> Which approach do you prefer?
>>>>> I need to know the answer before spending too much time respinning
>>>>> the patch.
>>>>
>>>> Yes, sorry about the slow and sometimes conflicting feedback.
>>>>
>>>>> 1) Fix the KVM_NMI emulation (the v3 patchset)
>>>>> - It fixes the problem directly and matches real hardware
>>>>> more closely, but it changes KVM_NMI behavior.
>>>>> - Requires both a kernel-side and a userspace-side fix.
>>>>>
>>>>> 2) Get the LAPIC state from the kernel irqchip, and inject the NMI if it is
>>>>> allowed (the v4 patchset)
>>>>> - Simple; does not change any kernel behavior.
>>>>> - Only needs the userspace-side fix.
>>>>>
>>>>> 3) Add KVM_SET_LINT1 (the v5 patchset)
>>>>> - Does not change the kernel's KVM_NMI behavior.
>>>>> - Much more complex.
>>>>> - Requires both a kernel-side and a userspace-side fix.
>>>>> - Userspace must also handle the case where KVM_SET_LINT1 is
>>>>> unavailable, which reuses all of the code from approach 2). In effect,
>>>>> this approach equals approach 2) plus the KVM_SET_LINT1 ioctl.
>>>>>
>>>>> This is an urgent bug for us, so we need to settle it soon.
>>>>
>>>> While (1) is simple, it overloads a single ioctl with two meanings,
>>>> that's not so good.
>>>>
>>>> Whether we do (1) or (3), we need (2) as well, for older kernels.
>>>>
>>>> So I recommend first focusing on (2) and merging it, then doing (3).
>>>>
>>>> (note an additional issue with 3 is whether to make it a vm or vcpu
>>>> ioctl - we've been assuming vcpu ioctl but it's not necessarily the best
>>>> choice).
>>>>
>>>
>>> This is approach 2).
>>> It only changes the userspace side; the kernel side is not touched.
>>> It is updated from the previous v4 patch and fixes the problems found by Jan.
>>> ----------------------------
>>>
>>> From: Lai Jiangshan <laijs@cn.fujitsu.com>
>>>
>>> Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
>>> button event happens. This doesn't properly emulate real hardware on
>>> which NMI button event triggers LINT1. Because of this, NMI is sent to
>>> the processor even when LINT1 is masked in LVT. For example, this
>>> causes the problem that kdump initiated by NMI sometimes doesn't work
>>> on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
>>>
>>> With this patch, inject-nmi request is handled as follows.
>>>
>>> - When in-kernel irqchip is disabled, deliver LINT1 instead of NMI
>>> interrupt.
>>> - When in-kernel irqchip is enabled, read the in-kernel LAPIC state
>>> and test APIC_LVT_MASKED; if LINT1 is unmasked, deliver
>>> the NMI directly. (Suggested by Jan Kiszka)
>>>
>>> Changed from old version:
>>> re-implemented following Jan's suggestion.
>>> fixed the race found by Jan.
>>>
>>> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
>>> Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
>>> ---
>>> hw/apic.c | 33 +++++++++++++++++++++++++++++++++
>>> hw/apic.h | 1 +
>>> monitor.c | 6 +++++-
>>> 3 files changed, 39 insertions(+), 1 deletions(-)
>>> diff --git a/hw/apic.c b/hw/apic.c
>>> index 69d6ac5..922796a 100644
>>> --- a/hw/apic.c
>>> +++ b/hw/apic.c
>>> @@ -205,6 +205,39 @@ void apic_deliver_pic_intr(DeviceState *d, int level)
>>> }
>>> }
>>>
>>> +static inline uint32_t kapic_reg(struct kvm_lapic_state *kapic, int reg_id);
>>> +
>>> +static void kvm_irqchip_deliver_nmi(void *p)
>>> +{
>>> + APICState *s = p;
>>> + struct kvm_lapic_state klapic;
>>> + uint32_t lvt;
>>> +
>>> + kvm_get_lapic(s->cpu_env, &klapic);
>>> + lvt = kapic_reg(&klapic, 0x32 + APIC_LVT_LINT1);
>>> +
>>> + if (lvt & APIC_LVT_MASKED) {
>>> + return;
>>> + }
>>> +
>>> + if (((lvt >> 8) & 7) != APIC_DM_NMI) {
>>> + return;
>>> + }
>>> +
>>> + kvm_vcpu_ioctl(s->cpu_env, KVM_NMI);
>>> +}
>>> +
>>> +void apic_deliver_nmi(DeviceState *d)
>>> +{
>>> + APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
>>> +
>>> + if (kvm_irqchip_in_kernel()) {
>>> + run_on_cpu(s->cpu_env, kvm_irqchip_deliver_nmi, s);
>>> + } else {
>>> + apic_local_deliver(s, APIC_LVT_LINT1);
>>> + }
>>> +}
>>> +
>>> #define foreach_apic(apic, deliver_bitmask, code) \
>>> {\
>>> int __i, __j, __mask;\
>>> diff --git a/hw/apic.h b/hw/apic.h
>>> index c857d52..3a4be0a 100644
>>> --- a/hw/apic.h
>>> +++ b/hw/apic.h
>>> @@ -10,6 +10,7 @@ void apic_deliver_irq(uint8_t dest, uint8_t dest_mode,
>>> uint8_t trigger_mode);
>>> int apic_accept_pic_intr(DeviceState *s);
>>> void apic_deliver_pic_intr(DeviceState *s, int level);
>>> +void apic_deliver_nmi(DeviceState *d);
>>> int apic_get_interrupt(DeviceState *s);
>>> void apic_reset_irq_delivered(void);
>>> int apic_get_irq_delivered(void);
>>> diff --git a/monitor.c b/monitor.c
>>> index cb485bf..0b81f17 100644
>>> --- a/monitor.c
>>> +++ b/monitor.c
>>> @@ -2616,7 +2616,11 @@ static int do_inject_nmi(Monitor *mon, const QDict *qdict, QObject **ret_data)
>>> CPUState *env;
>>>
>>> for (env = first_cpu; env != NULL; env = env->next_cpu) {
>>> - cpu_interrupt(env, CPU_INTERRUPT_NMI);
>>> + if (!env->apic_state) {
>>> + cpu_interrupt(env, CPU_INTERRUPT_NMI);
>>> + } else {
>>> + apic_deliver_nmi(env->apic_state);
>>> + }
>>> }
>>>
>>> return 0;
>>
>> Looks OK to me.
>>
>> Please don't forget to bake a qemu-only patch for those bits that apply
>> to upstream as well (ie. the user space APIC path).
>>
>> Jan
>>
>
> I did forget it.
> Did you mean we need to add "#ifdef KVM_CAP_IRQCHIP" back?
No. I meant basically your patch minus the kvm_in_kernel_irqchip code
paths, applicable against current qemu.git. Those paths will be re-added
(slightly differently) when upstream gains that support. I'm working on
a basic version and will incorporate the logic if your qemu patch is
already available.
Jan
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 69+ messages in thread
* [Qemu-devel] [PATCH 1/1 V6] qemu: fix improper nmi emulation
2011-10-19 10:57 ` Jan Kiszka
@ 2011-10-19 15:21 ` Lai Jiangshan
0 siblings, 0 replies; 69+ messages in thread
From: Lai Jiangshan @ 2011-10-19 15:21 UTC (permalink / raw)
To: Jan Kiszka
Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org, Avi Kivity,
KAMEZAWA Hiroyuki, Kenji Kaneshige
On 10/19/2011 06:57 PM, Jan Kiszka wrote:
>>>
>>> Looks OK to me.
>>>
>>> Please don't forget to bake a qemu-only patch for those bits that apply
>>> to upstream as well (ie. the user space APIC path).
>>>
>>> Jan
>>>
>>
>> I did forget it.
>> Did you mean we need to add "#ifdef KVM_CAP_IRQCHIP" back?
>
> No. I meant basically your patch minus the kvm_in_kernel_irqchip code
> paths, applicable against current qemu.git. Those paths will be re-added
> (slightly differently) when upstream gains that support. I'm working on
> a basic version and will incorporate the logic if your qemu patch is
> already available.
>
> Jan
>
Patch for qemu.git
From: Lai Jiangshan <laijs@cn.fujitsu.com>
Currently, an NMI is blindly sent to all vCPUs when an NMI button
event occurs. This does not properly emulate real hardware, on which an
NMI button event triggers LINT1. As a result, the NMI is delivered to
the processor even when LINT1 is masked in the LVT. For example, kdump
initiated by NMI sometimes fails on KVM, because kdump assumes that NMI
is masked on CPUs other than CPU0.
With this patch, an inject-nmi request is handled by delivering LINT1.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
---
hw/apic.c | 7 +++++++
hw/apic.h | 1 +
monitor.c | 6 +++++-
3 files changed, 13 insertions(+), 1 deletions(-)
diff --git a/hw/apic.c b/hw/apic.c
index 8289eef..c8dc997 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -205,6 +205,13 @@ void apic_deliver_pic_intr(DeviceState *d, int level)
}
}
+void apic_deliver_nmi(DeviceState *d)
+{
+ APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
+
+ apic_local_deliver(s, APIC_LVT_LINT1);
+}
+
#define foreach_apic(apic, deliver_bitmask, code) \
{\
int __i, __j, __mask;\
diff --git a/hw/apic.h b/hw/apic.h
index a5c910f..a62d83b 100644
--- a/hw/apic.h
+++ b/hw/apic.h
@@ -8,6 +8,7 @@ void apic_deliver_irq(uint8_t dest, uint8_t dest_mode, uint8_t delivery_mode,
uint8_t vector_num, uint8_t trigger_mode);
int apic_accept_pic_intr(DeviceState *s);
void apic_deliver_pic_intr(DeviceState *s, int level);
+void apic_deliver_nmi(DeviceState *d);
int apic_get_interrupt(DeviceState *s);
void apic_reset_irq_delivered(void);
int apic_get_irq_delivered(void);
diff --git a/monitor.c b/monitor.c
index ffda0fe..144099a 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2501,7 +2501,11 @@ static int do_inject_nmi(Monitor *mon, const QDict *qdict, QObject **ret_data)
CPUState *env;
for (env = first_cpu; env != NULL; env = env->next_cpu) {
- cpu_interrupt(env, CPU_INTERRUPT_NMI);
+ if (!env->apic_state) {
+ cpu_interrupt(env, CPU_INTERRUPT_NMI);
+ } else {
+ apic_deliver_nmi(env->apic_state);
+ }
}
return 0;
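
To make the behavioural change concrete, here is a minimal, self-contained
model of the two inject-nmi strategies. It is only a sketch, not QEMU code:
the constants follow the LVT layout used by the patches in this thread (mask
bit 16, delivery mode in bits 10:8, NMI mode 4), while `struct toy_cpu` and
all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* LVT layout as used by the patches in this thread. */
#define LVT_MASKED   (1u << 16)   /* bit 16: entry is masked */
#define DM_NMI       4u           /* delivery mode field value for NMI */

/* Hypothetical stand-in for a vCPU; only the LINT1 LVT entry matters here. */
struct toy_cpu {
    uint32_t lvt_lint1;
    int has_apic;
};

/* Returns 1 if a LINT1 event would reach the CPU as an NMI. */
static int lint1_delivers_nmi(uint32_t lvt)
{
    if (lvt & LVT_MASKED) {
        return 0;
    }
    return ((lvt >> 8) & 7u) == DM_NMI;
}

/* Old behaviour: every vCPU gets the NMI unconditionally. */
static size_t count_nmis_blind(const struct toy_cpu *cpus, size_t n)
{
    assert(cpus != NULL || n == 0);
    return n;
}

/* New behaviour: CPUs with an APIC only see the NMI if LINT1 allows it. */
static size_t count_nmis_via_lint1(const struct toy_cpu *cpus, size_t n)
{
    size_t hits = 0;

    assert(cpus != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!cpus[i].has_apic || lint1_delivers_nmi(cpus[i].lvt_lint1)) {
            hits++;
        }
    }
    return hits;
}
```

With CPU0's LINT1 unmasked in NMI mode and two further CPUs masked, the
blind path reports three deliveries and the LINT1 path reports one, which
is the situation kdump relies on.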
^ permalink raw reply related [flat|nested] 69+ messages in thread
* Re: [Qemu-devel] [PATCH 1/1 V6] qemu-kvm: fix improper nmi emulation
2011-10-18 19:41 ` Jan Kiszka
2011-10-19 6:33 ` Lai Jiangshan
@ 2011-10-19 9:29 ` Avi Kivity
2011-10-19 15:32 ` Lai Jiangshan
1 sibling, 1 reply; 69+ messages in thread
From: Avi Kivity @ 2011-10-19 9:29 UTC (permalink / raw)
To: Jan Kiszka
Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org, Lai Jiangshan,
KAMEZAWA Hiroyuki, Kenji Kaneshige
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On 10/18/2011 09:41 PM, Jan Kiszka wrote:
>
> Looks OK to me.
>
>
Same here.
- --
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
iQIcBAEBAgAGBQJOnpiNAAoJEI7yEDeUysxllqUP/3K9oPbz9OxbqH3+9G1W9cUy
49hKR0DtLyf5WH0hoSq3/jA2T00PWR6fLIo6itth76x/TqnIuimjln6Nrj/T2nhO
PPvwJB4OE/9ahSlm3JOVsE/JYwDx6h3u9eouN5BqVoQax8S3mnhxSGLxZOp8wvar
ol6vDj2U8JbigV3fCsFheiP9tTZWZgH66qCdCUzuNUnYWUW5m9repdsXflTp6YyW
id30xzuZETnQ/0RFU0hnhrfQ/vvm1dJeK6Y2bPKowoDCp+CFNi/CnJYDAZA18FSQ
V5096U8cj8/m/Hr8fPLpyZzDonPz0KfMPvtfV9rVHEtqvf04Ym+gcdfwo+2U4LQs
16RNGWwsF6qIAcyevK9xCpcU9g00v6m0fyj3eQgD+JT+pV+m8QCzNnQyDqDlEUEl
ub0WR7ilnl3/NIa6FTKHqZ5Wct8f9mO6wcCtJKXDTcHo/2uB5+kHzqJsLE2UCaXm
ptaiyFGZgGNpUocO+tYxeORWm4kNMoZRAaYmiU0RWaoIkQMY0P/m/Ghy+nZBUexM
vdH1lQ8DQoqQQxiC38MoO717rBOHDgxPoUGVPyPtU7qPhI2sSMYa2r+Uwi/Pmsm/
/dbKMbQs9q9pVkESBsmpkSLMVOrLQE/ju3h7iikZmY5RVrm+pI8fyOo9e20+/mKG
aO5IT5IDaHXAVk8jjAWB
=rMf/
-----END PGP SIGNATURE-----
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [Qemu-devel] [PATCH 1/1 V6] qemu-kvm: fix improper nmi emulation
2011-10-19 9:29 ` [Qemu-devel] [PATCH 1/1 V6] qemu-kvm: " Avi Kivity
@ 2011-10-19 15:32 ` Lai Jiangshan
0 siblings, 0 replies; 69+ messages in thread
From: Lai Jiangshan @ 2011-10-19 15:32 UTC (permalink / raw)
To: Avi Kivity
Cc: kvm@vger.kernel.org, Jan Kiszka, qemu-devel@nongnu.org,
KAMEZAWA Hiroyuki, Kenji Kaneshige
On 10/19/2011 05:29 PM, Avi Kivity wrote:
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 10/18/2011 09:41 PM, Jan Kiszka wrote:
>>
>> Looks OK to me.
>>
>>
>
> Same here.
Who will merge it?
Thanks,
Lai
>
> - --
> I have a truly marvellous patch that fixes the bug which this
> signature is too narrow to contain.
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.11 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
>
> iQIcBAEBAgAGBQJOnpiNAAoJEI7yEDeUysxllqUP/3K9oPbz9OxbqH3+9G1W9cUy
> 49hKR0DtLyf5WH0hoSq3/jA2T00PWR6fLIo6itth76x/TqnIuimjln6Nrj/T2nhO
> PPvwJB4OE/9ahSlm3JOVsE/JYwDx6h3u9eouN5BqVoQax8S3mnhxSGLxZOp8wvar
> ol6vDj2U8JbigV3fCsFheiP9tTZWZgH66qCdCUzuNUnYWUW5m9repdsXflTp6YyW
> id30xzuZETnQ/0RFU0hnhrfQ/vvm1dJeK6Y2bPKowoDCp+CFNi/CnJYDAZA18FSQ
> V5096U8cj8/m/Hr8fPLpyZzDonPz0KfMPvtfV9rVHEtqvf04Ym+gcdfwo+2U4LQs
> 16RNGWwsF6qIAcyevK9xCpcU9g00v6m0fyj3eQgD+JT+pV+m8QCzNnQyDqDlEUEl
> ub0WR7ilnl3/NIa6FTKHqZ5Wct8f9mO6wcCtJKXDTcHo/2uB5+kHzqJsLE2UCaXm
> ptaiyFGZgGNpUocO+tYxeORWm4kNMoZRAaYmiU0RWaoIkQMY0P/m/Ghy+nZBUexM
> vdH1lQ8DQoqQQxiC38MoO717rBOHDgxPoUGVPyPtU7qPhI2sSMYa2r+Uwi/Pmsm/
> /dbKMbQs9q9pVkESBsmpkSLMVOrLQE/ju3h7iikZmY5RVrm+pI8fyOo9e20+/mKG
> aO5IT5IDaHXAVk8jjAWB
> =rMf/
> -----END PGP SIGNATURE-----
>
>
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [Qemu-devel] [PATCH 1/1 V6] qemu-kvm: fix improper nmi emulation
2011-10-17 16:00 ` [Qemu-devel] [PATCH 1/1 V6] qemu-kvm: " Lai Jiangshan
2011-10-18 19:41 ` Jan Kiszka
@ 2011-12-07 10:29 ` Avi Kivity
2011-12-08 9:42 ` Jan Kiszka
1 sibling, 1 reply; 69+ messages in thread
From: Avi Kivity @ 2011-12-07 10:29 UTC (permalink / raw)
To: Lai Jiangshan
Cc: kvm@vger.kernel.org, qemu-devel@nongnu.org, Jan Kiszka,
Sasha Levin, Kenji Kaneshige, KAMEZAWA Hiroyuki
On 10/17/2011 06:00 PM, Lai Jiangshan wrote:
> From: Lai Jiangshan <laijs@cn.fujitsu.com>
>
> Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
> button event happens. This doesn't properly emulate real hardware on
> which NMI button event triggers LINT1. Because of this, NMI is sent to
> the processor even when LINT1 is masked in LVT. For example, this
> causes the problem that kdump initiated by NMI sometimes doesn't work
> on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
>
> With this patch, inject-nmi request is handled as follows.
>
> - When in-kernel irqchip is disabled, deliver LINT1 instead of NMI
> interrupt.
> - When in-kernel irqchip is enabled, get the in-kernel LAPIC states
> and test the APIC_LVT_MASKED, if LINT1 is unmasked, and then
> delivering the NMI directly. (Suggested by Jan Kiszka)
>
> Changed from old version:
> re-implement it by the Jan's suggestion.
> fix the race found by Jan.
This patch fell through the cracks, sorry. Now applied.
Sasha, this patch highlights the issues with KVM_NMI.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [Qemu-devel] [PATCH 1/1 V6] qemu-kvm: fix improper nmi emulation
2011-12-07 10:29 ` Avi Kivity
@ 2011-12-08 9:42 ` Jan Kiszka
2011-12-08 10:20 ` Jan Kiszka
0 siblings, 1 reply; 69+ messages in thread
From: Jan Kiszka @ 2011-12-08 9:42 UTC (permalink / raw)
To: Lai Jiangshan
Cc: kvm@vger.kernel.org, qemu-devel@nongnu.org, Sasha Levin,
Kenji Kaneshige, KAMEZAWA Hiroyuki, Avi Kivity
On 2011-12-07 11:29, Avi Kivity wrote:
> On 10/17/2011 06:00 PM, Lai Jiangshan wrote:
>> From: Lai Jiangshan <laijs@cn.fujitsu.com>
>>
>> Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
>> button event happens. This doesn't properly emulate real hardware on
>> which NMI button event triggers LINT1. Because of this, NMI is sent to
>> the processor even when LINT1 is masked in LVT. For example, this
>> causes the problem that kdump initiated by NMI sometimes doesn't work
>> on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
>>
>> With this patch, inject-nmi request is handled as follows.
>>
>> - When in-kernel irqchip is disabled, deliver LINT1 instead of NMI
>> interrupt.
>> - When in-kernel irqchip is enabled, get the in-kernel LAPIC states
>> and test the APIC_LVT_MASKED, if LINT1 is unmasked, and then
>> delivering the NMI directly. (Suggested by Jan Kiszka)
>>
>> Changed from old version:
>> re-implement it by the Jan's suggestion.
>> fix the race found by Jan.
>
> This patch fell through the cracks, sorry. Now applied.
Lai, what is the state of a corresponding QEMU upstream patch? I'd like
to build on top of it for my upstream irqchip series.
Thanks,
Jan
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [Qemu-devel] [PATCH 1/1 V6] qemu-kvm: fix improper nmi emulation
2011-12-08 9:42 ` Jan Kiszka
@ 2011-12-08 10:20 ` Jan Kiszka
0 siblings, 0 replies; 69+ messages in thread
From: Jan Kiszka @ 2011-12-08 10:20 UTC (permalink / raw)
To: Lai Jiangshan
Cc: kvm@vger.kernel.org, qemu-devel@nongnu.org, Sasha Levin,
Kenji Kaneshige, KAMEZAWA Hiroyuki, Avi Kivity
On 2011-12-08 10:42, Jan Kiszka wrote:
> On 2011-12-07 11:29, Avi Kivity wrote:
>> On 10/17/2011 06:00 PM, Lai Jiangshan wrote:
>>> From: Lai Jiangshan <laijs@cn.fujitsu.com>
>>>
>>> Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
>>> button event happens. This doesn't properly emulate real hardware on
>>> which NMI button event triggers LINT1. Because of this, NMI is sent to
>>> the processor even when LINT1 is masked in LVT. For example, this
>>> causes the problem that kdump initiated by NMI sometimes doesn't work
>>> on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
>>>
>>> With this patch, inject-nmi request is handled as follows.
>>>
>>> - When in-kernel irqchip is disabled, deliver LINT1 instead of NMI
>>> interrupt.
>>> - When in-kernel irqchip is enabled, get the in-kernel LAPIC states
>>> and test the APIC_LVT_MASKED, if LINT1 is unmasked, and then
>>> delivering the NMI directly. (Suggested by Jan Kiszka)
>>>
>>> Changed from old version:
>>> re-implement it by the Jan's suggestion.
>>> fix the race found by Jan.
>>
>> This patch fell through the cracks, sorry. Now applied.
>
> Lai, what is the state of a corresponding QEMU upstream patch? I'd like
> to build on top of it for my upstream irqchip series.
Never mind, I'll include a patch in my series as it requires some
tweaking to the APIC backend concept.
Jan
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 69+ messages in thread
* [Qemu-devel] [PATCH 1/2 V5] qemu-kvm: Synchronize kernel headers
2011-10-14 6:49 ` Jan Kiszka
2011-10-14 7:43 ` Lai Jiangshan
2011-10-14 9:03 ` [Qemu-devel] [PATCH 1/1 V5] kernel/kvm: introduce KVM_SET_LINT1 and " Lai Jiangshan
@ 2011-10-14 9:03 ` Lai Jiangshan
2011-10-14 9:03 ` [Qemu-devel] [PATCH 2/2 V5] qemu-kvm: fix improper nmi emulation Lai Jiangshan
` (2 subsequent siblings)
5 siblings, 0 replies; 69+ messages in thread
From: Lai Jiangshan @ 2011-10-14 9:03 UTC (permalink / raw)
To: Jan Kiszka
Cc: kvm@vger.kernel.org, Avi Kivity, qemu-devel@nongnu.org,
KAMEZAWA Hiroyuki, Kenji Kaneshige
Synchronize the kernel headers with the newest kernel, which adds
KVM_CAP_SET_LINT1 and KVM_SET_LINT1, by running
./scripts/update-linux-headers.sh
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
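linux-headers/asm-powerpc/kvm.h | 19 +++++++++++++++++--
linux-headers/asm-x86/kvm.h | 1 +
linux-headers/asm-x86/kvm_para.h | 14 ++++++++++++++
linux-headers/linux/kvm.h | 26 +++++++++++++++++++-------
linux-headers/linux/kvm_para.h | 1 +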
5 files changed, 52 insertions(+), 9 deletions(-)
diff --git a/linux-headers/asm-powerpc/kvm.h b/linux-headers/asm-powerpc/kvm.h
index 777d307..a4f6c85 100644
--- a/linux-headers/asm-powerpc/kvm.h
+++ b/linux-headers/asm-powerpc/kvm.h
@@ -22,6 +22,10 @@
#include <linux/types.h>
+/* Select powerpc specific features in <linux/kvm.h> */
+#define __KVM_HAVE_SPAPR_TCE
+#define __KVM_HAVE_PPC_SMT
+
struct kvm_regs {
__u64 pc;
__u64 cr;
@@ -166,8 +170,8 @@ struct kvm_sregs {
} ppc64;
struct {
__u32 sr[16];
- __u64 ibat[8];
- __u64 dbat[8];
+ __u64 ibat[8];
+ __u64 dbat[8];
} ppc32;
} s;
struct {
@@ -272,4 +276,15 @@ struct kvm_guest_debug_arch {
#define KVM_INTERRUPT_UNSET -2U
#define KVM_INTERRUPT_SET_LEVEL -3U
+/* for KVM_CAP_SPAPR_TCE */
+struct kvm_create_spapr_tce {
+ __u64 liobn;
+ __u32 window_size;
+};
+
+/* for KVM_ALLOCATE_RMA */
+struct kvm_allocate_rma {
+ __u64 rma_size;
+};
+
#endif /* __LINUX_KVM_POWERPC_H */
diff --git a/linux-headers/asm-x86/kvm.h b/linux-headers/asm-x86/kvm.h
index 4d8dcbd..88d0ac3 100644
--- a/linux-headers/asm-x86/kvm.h
+++ b/linux-headers/asm-x86/kvm.h
@@ -24,6 +24,7 @@
#define __KVM_HAVE_DEBUGREGS
#define __KVM_HAVE_XSAVE
#define __KVM_HAVE_XCRS
+#define __KVM_HAVE_SET_LINT1
/* Architectural interrupt line count. */
#define KVM_NR_INTERRUPTS 256
diff --git a/linux-headers/asm-x86/kvm_para.h b/linux-headers/asm-x86/kvm_para.h
index 834d71e..f2ac46a 100644
--- a/linux-headers/asm-x86/kvm_para.h
+++ b/linux-headers/asm-x86/kvm_para.h
@@ -21,6 +21,7 @@
*/
#define KVM_FEATURE_CLOCKSOURCE2 3
#define KVM_FEATURE_ASYNC_PF 4
+#define KVM_FEATURE_STEAL_TIME 5
/* The last 8 bits are used to indicate how to interpret the flags field
* in pvclock structure. If no bits are set, all flags are ignored.
@@ -30,10 +31,23 @@
#define MSR_KVM_WALL_CLOCK 0x11
#define MSR_KVM_SYSTEM_TIME 0x12
+#define KVM_MSR_ENABLED 1
/* Custom MSRs falls in the range 0x4b564d00-0x4b564dff */
#define MSR_KVM_WALL_CLOCK_NEW 0x4b564d00
#define MSR_KVM_SYSTEM_TIME_NEW 0x4b564d01
#define MSR_KVM_ASYNC_PF_EN 0x4b564d02
+#define MSR_KVM_STEAL_TIME 0x4b564d03
+
+struct kvm_steal_time {
+ __u64 steal;
+ __u32 version;
+ __u32 flags;
+ __u32 pad[12];
+};
+
+#define KVM_STEAL_ALIGNMENT_BITS 5
+#define KVM_STEAL_VALID_BITS ((-1ULL << (KVM_STEAL_ALIGNMENT_BITS + 1)))
+#define KVM_STEAL_RESERVED_MASK (((1 << KVM_STEAL_ALIGNMENT_BITS) - 1 ) << 1)
#define KVM_MAX_MMU_OP_BATCH 32
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index fc63b73..86808b4 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -161,6 +161,7 @@ struct kvm_pit_config {
#define KVM_EXIT_NMI 16
#define KVM_EXIT_INTERNAL_ERROR 17
#define KVM_EXIT_OSI 18
+#define KVM_EXIT_PAPR_HCALL 19
/* For KVM_EXIT_INTERNAL_ERROR */
#define KVM_INTERNAL_ERROR_EMULATION 1
@@ -264,6 +265,11 @@ struct kvm_run {
struct {
__u64 gprs[32];
} osi;
+ struct {
+ __u64 nr;
+ __u64 ret;
+ __u64 args[9];
+ } papr_hcall;
/* Fix the size of the union. */
char padding[256];
};
@@ -544,6 +550,13 @@ struct kvm_ppc_pvinfo {
#define KVM_CAP_TSC_CONTROL 60
#define KVM_CAP_GET_TSC_KHZ 61
#define KVM_CAP_PPC_BOOKE_SREGS 62
+#define KVM_CAP_SPAPR_TCE 63
+#define KVM_CAP_PPC_SMT 64
+#define KVM_CAP_PPC_RMA 65
+#define KVM_CAP_S390_GMAP 71
+#ifdef __KVM_HAVE_SET_LINT1
+#define KVM_CAP_SET_LINT1 72
+#endif
#ifdef KVM_CAP_IRQ_ROUTING
@@ -746,6 +759,11 @@ struct kvm_clock_data {
/* Available with KVM_CAP_XCRS */
#define KVM_GET_XCRS _IOR(KVMIO, 0xa6, struct kvm_xcrs)
#define KVM_SET_XCRS _IOW(KVMIO, 0xa7, struct kvm_xcrs)
+#define KVM_CREATE_SPAPR_TCE _IOW(KVMIO, 0xa8, struct kvm_create_spapr_tce)
+/* Available with KVM_CAP_RMA */
+#define KVM_ALLOCATE_RMA _IOR(KVMIO, 0xa9, struct kvm_allocate_rma)
+/* Available with KVM_CAP_SET_LINT1 for x86 */
+#define KVM_SET_LINT1 _IO(KVMIO, 0xaa)
#define KVM_DEV_ASSIGN_ENABLE_IOMMU (1 << 0)
@@ -773,20 +791,14 @@ struct kvm_assigned_pci_dev {
struct kvm_assigned_irq {
__u32 assigned_dev_id;
- __u32 host_irq;
+ __u32 host_irq; /* ignored (legacy field) */
__u32 guest_irq;
__u32 flags;
union {
- struct {
- __u32 addr_lo;
- __u32 addr_hi;
- __u32 data;
- } guest_msi;
__u32 reserved[12];
};
};
-
struct kvm_assigned_msix_nr {
__u32 assigned_dev_id;
__u16 entry_nr;
diff --git a/linux-headers/linux/kvm_para.h b/linux-headers/linux/kvm_para.h
index 7bdcf93..b315e27 100644
--- a/linux-headers/linux/kvm_para.h
+++ b/linux-headers/linux/kvm_para.h
@@ -26,3 +26,4 @@
#include <asm/kvm_para.h>
#endif /* __LINUX_KVM_PARA_H */
+
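
For readers unfamiliar with how a definition such as `_IO(KVMIO, 0xaa)`
turns into an ioctl number, the following sketch re-implements the generic
Linux encoding that x86 uses. It is an illustration only: the `TOY_`/`toy_`
names are hypothetical, the value 0xAE for KVMIO is taken from linux/kvm.h,
and some other architectures (powerpc, for example) encode the direction
bits differently, so the numbers below should be read as x86-specific.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions of the generic Linux ioctl encoding (x86 uses these). */
#define TOY_IOC_NRSHIFT    0
#define TOY_IOC_TYPESHIFT  8
#define TOY_IOC_SIZESHIFT  16
#define TOY_IOC_DIRSHIFT   30
#define TOY_IOC_NONE       0u     /* no data transfer */

#define TOY_KVMIO          0xAEu  /* ioctl "type" byte used by KVM */

/* Packs direction, type, command number and argument size into one word. */
static uint32_t toy_ioc(uint32_t dir, uint32_t type, uint32_t nr, uint32_t size)
{
    assert(type <= 0xffu && nr <= 0xffu);
    return (dir << TOY_IOC_DIRSHIFT) | (type << TOY_IOC_TYPESHIFT) |
           (nr << TOY_IOC_NRSHIFT) | (size << TOY_IOC_SIZESHIFT);
}

/* Equivalent of _IO(type, nr): no direction bits and a zero size field. */
static uint32_t toy_io(uint32_t type, uint32_t nr)
{
    return toy_ioc(TOY_IOC_NONE, type, nr, 0);
}
```

Under these assumptions `toy_io(TOY_KVMIO, 0xaa)` evaluates to 0xAEAA, so
the new request sits right after KVM_ALLOCATE_RMA (0xa9) in the KVM
command number space.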
^ permalink raw reply related [flat|nested] 69+ messages in thread
* [Qemu-devel] [PATCH 2/2 V5] qemu-kvm: fix improper nmi emulation
2011-10-14 6:49 ` Jan Kiszka
` (2 preceding siblings ...)
2011-10-14 9:03 ` [Qemu-devel] [PATCH 1/2 V5] qemu-kvm: Synchronize kernel headers Lai Jiangshan
@ 2011-10-14 9:03 ` Lai Jiangshan
2011-10-14 9:22 ` Jan Kiszka
2011-10-14 9:51 ` [Qemu-devel] [PATCH 1/1 V5 tuning] kernel/kvm: introduce KVM_SET_LINT1 and " Lai Jiangshan
2011-10-14 9:51 ` [Qemu-devel] [PATCH 1/2 V5 tuning] qemu-kvm: Synchronize kernel headers Lai Jiangshan
5 siblings, 1 reply; 69+ messages in thread
From: Lai Jiangshan @ 2011-10-14 9:03 UTC (permalink / raw)
To: Jan Kiszka
Cc: kvm@vger.kernel.org, Avi Kivity, qemu-devel@nongnu.org,
KAMEZAWA Hiroyuki, Kenji Kaneshige
Currently, an NMI is blindly sent to all vCPUs when an NMI button
event occurs. This does not properly emulate real hardware, on which an
NMI button event triggers LINT1. As a result, the NMI is delivered to
the processor even when LINT1 is masked in the LVT. For example, kdump
initiated by NMI sometimes fails on KVM, because kdump assumes that NMI
is masked on CPUs other than CPU0.
With this patch, an inject-nmi request is handled as follows.
- When the in-kernel irqchip is enabled and KVM_SET_LINT1 is available,
inject LINT1 instead of an NMI.
- Otherwise, when the in-kernel irqchip is enabled, read the in-kernel
LAPIC state and test APIC_LVT_MASKED; if LINT1 is unmasked,
deliver the NMI directly.
- Otherwise, the userland LAPIC emulates the NMI button and injects an
NMI if LINT1 is unmasked.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
---
hw/apic.c | 72 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
hw/apic.h | 1 +
monitor.c | 6 ++++-
3 files changed, 78 insertions(+), 1 deletions(-)
diff --git a/hw/apic.c b/hw/apic.c
index 69d6ac5..91b82d0 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -205,6 +205,78 @@ void apic_deliver_pic_intr(DeviceState *d, int level)
}
}
+#ifdef KVM_CAP_IRQCHIP
+static inline uint32_t kapic_reg(struct kvm_lapic_state *kapic, int reg_id);
+
+static void kvm_irqchip_deliver_nmi(void *p)
+{
+ APICState *s = p;
+ struct kvm_lapic_state klapic;
+ uint32_t lvt;
+
+ kvm_get_lapic(s->cpu_env, &klapic);
+ lvt = kapic_reg(&klapic, 0x32 + APIC_LVT_LINT1);
+
+ if (lvt & APIC_LVT_MASKED) {
+ return;
+ }
+
+ if (((lvt >> 8) & 7) != APIC_DM_NMI) {
+ return;
+ }
+
+ kvm_vcpu_ioctl(s->cpu_env, KVM_NMI);
+}
+
+static void __apic_deliver_nmi(APICState *s)
+{
+ if (kvm_irqchip_in_kernel()) {
+ run_on_cpu(s->cpu_env, kvm_irqchip_deliver_nmi, s);
+ } else {
+ apic_local_deliver(s, APIC_LVT_LINT1);
+ }
+}
+#else
+static void __apic_deliver_nmi(APICState *s)
+{
+ apic_local_deliver(s, APIC_LVT_LINT1);
+}
+#endif
+
+enum {
+ KVM_SET_LINT1_UNKNOWN,
+ KVM_SET_LINT1_ENABLED,
+ KVM_SET_LINT1_DISABLED,
+};
+
+static void kvm_set_lint1(void *p)
+{
+ CPUState *env = p;
+
+ kvm_vcpu_ioctl(env, KVM_SET_LINT1);
+}
+
+void apic_deliver_nmi(DeviceState *d)
+{
+ APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
+ static int kernel_lint1 = KVM_SET_LINT1_UNKNOWN;
+
+ if (kernel_lint1 == KVM_SET_LINT1_UNKNOWN) {
+ if (kvm_enabled() && kvm_irqchip_in_kernel() &&
+ kvm_check_extension(kvm_state, KVM_CAP_SET_LINT1)) {
+ kernel_lint1 = KVM_SET_LINT1_ENABLED;
+ } else {
+ kernel_lint1 = KVM_SET_LINT1_DISABLED;
+ }
+ }
+
+ if (kernel_lint1 == KVM_SET_LINT1_ENABLED) {
+ run_on_cpu(s->cpu_env, kvm_set_lint1, s->cpu_env);
+ } else {
+ __apic_deliver_nmi(s);
+ }
+}
+
#define foreach_apic(apic, deliver_bitmask, code) \
{\
int __i, __j, __mask;\
diff --git a/hw/apic.h b/hw/apic.h
index c857d52..3a4be0a 100644
--- a/hw/apic.h
+++ b/hw/apic.h
@@ -10,6 +10,7 @@ void apic_deliver_irq(uint8_t dest, uint8_t dest_mode,
uint8_t trigger_mode);
int apic_accept_pic_intr(DeviceState *s);
void apic_deliver_pic_intr(DeviceState *s, int level);
+void apic_deliver_nmi(DeviceState *d);
int apic_get_interrupt(DeviceState *s);
void apic_reset_irq_delivered(void);
int apic_get_irq_delivered(void);
diff --git a/monitor.c b/monitor.c
index cb485bf..0b81f17 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2616,7 +2616,11 @@ static int do_inject_nmi(Monitor *mon, const QDict *qdict, QObject **ret_data)
CPUState *env;
for (env = first_cpu; env != NULL; env = env->next_cpu) {
- cpu_interrupt(env, CPU_INTERRUPT_NMI);
+ if (!env->apic_state) {
+ cpu_interrupt(env, CPU_INTERRUPT_NMI);
+ } else {
+ apic_deliver_nmi(env->apic_state);
+ }
}
return 0;
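
The tri-state `kernel_lint1` variable in apic_deliver_nmi() above is a lazy,
probe-once cache of the capability check. A minimal standalone sketch of
that pattern follows. It is not the patch's code: the patch uses a
function-local static, whereas this version keeps the state in a
hypothetical `struct lint1_cache` so it can be exercised in isolation, and
`demo_probe_yes()` merely stands in for the kvm_check_extension() test. The
sketch also assumes that callers are serialized; it is not thread-safe.

```c
#include <assert.h>
#include <stddef.h>

enum lint1_support {
    LINT1_UNKNOWN,   /* not probed yet */
    LINT1_ENABLED,
    LINT1_DISABLED,
};

/* Hypothetical probe; stands in for the capability check. */
typedef int (*probe_fn)(void);

struct lint1_cache {
    enum lint1_support state;
    probe_fn probe;
};

/* Returns 1 if the capability is available, probing at most once. */
static int lint1_available(struct lint1_cache *c)
{
    assert(c != NULL && c->probe != NULL);
    if (c->state == LINT1_UNKNOWN) {
        c->state = c->probe() ? LINT1_ENABLED : LINT1_DISABLED;
    }
    return c->state == LINT1_ENABLED;
}

/* Demo probe that counts how often it is called. */
static int probe_calls;

static int demo_probe_yes(void)
{
    probe_calls++;
    return 1;
}
```

Calling lint1_available() repeatedly on the same cache runs the probe only
on the first call; later calls just read the cached state.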
^ permalink raw reply related [flat|nested] 69+ messages in thread
* Re: [Qemu-devel] [PATCH 2/2 V5] qemu-kvm: fix improper nmi emulation
2011-10-14 9:03 ` [Qemu-devel] [PATCH 2/2 V5] qemu-kvm: fix improper nmi emulation Lai Jiangshan
@ 2011-10-14 9:22 ` Jan Kiszka
0 siblings, 0 replies; 69+ messages in thread
From: Jan Kiszka @ 2011-10-14 9:22 UTC (permalink / raw)
To: Lai Jiangshan
Cc: kvm@vger.kernel.org, Avi Kivity, qemu-devel@nongnu.org,
KAMEZAWA Hiroyuki, Kenji Kaneshige
[-- Attachment #1: Type: text/plain, Size: 3670 bytes --]
On 2011-10-14 11:03, Lai Jiangshan wrote:
> Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
> button event happens. This doesn't properly emulate real hardware on
> which NMI button event triggers LINT1. Because of this, NMI is sent to
> the processor even when LINT1 is masked in LVT. For example, this
> causes the problem that kdump initiated by NMI sometimes doesn't work
> on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
>
> With this patch, inject-nmi request is handled as follows.
>
> - When in-kernel irqchip is enabled and KVM_SET_LINT1 is enabled,
> inject LINT1 instead of NMI interrupt.
>
> - otherwise when in-kernel irqchip is enabled, get the in-kernel
> LAPIC states and test the APIC_LVT_MASKED, if LINT1 is unmasked,
> and then delivering the NMI directly.
>
> - otherwise, userland lapic emulates NMI button and inject NMI
> if it is unmasked.
>
> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
> Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
> ---
> hw/apic.c | 72 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> hw/apic.h | 1 +
> monitor.c | 6 ++++-
> 3 files changed, 78 insertions(+), 1 deletions(-)
>
> diff --git a/hw/apic.c b/hw/apic.c
> index 69d6ac5..91b82d0 100644
> --- a/hw/apic.c
> +++ b/hw/apic.c
> @@ -205,6 +205,78 @@ void apic_deliver_pic_intr(DeviceState *d, int level)
> }
> }
>
> +#ifdef KVM_CAP_IRQCHIP
Please read all my comments. That unfortunately also applies to the rest
of the patch.
> +static inline uint32_t kapic_reg(struct kvm_lapic_state *kapic, int reg_id);
> +
> +static void kvm_irqchip_deliver_nmi(void *p)
> +{
> + APICState *s = p;
> + struct kvm_lapic_state klapic;
> + uint32_t lvt;
> +
> + kvm_get_lapic(s->cpu_env, &klapic);
> + lvt = kapic_reg(&klapic, 0x32 + APIC_LVT_LINT1);
> +
> + if (lvt & APIC_LVT_MASKED) {
> + return;
> + }
> +
> + if (((lvt >> 8) & 7) != APIC_DM_NMI) {
> + return;
> + }
> +
> + kvm_vcpu_ioctl(s->cpu_env, KVM_NMI);
> +}
> +
> +static void __apic_deliver_nmi(APICState *s)
> +{
> + if (kvm_irqchip_in_kernel()) {
> + run_on_cpu(s->cpu_env, kvm_irqchip_deliver_nmi, s);
> + } else {
> + apic_local_deliver(s, APIC_LVT_LINT1);
> + }
> +}
> +#else
> +static void __apic_deliver_nmi(APICState *s)
> +{
> + apic_local_deliver(s, APIC_LVT_LINT1);
> +}
> +#endif
> +
> +enum {
> + KVM_SET_LINT1_UNKNOWN,
> + KVM_SET_LINT1_ENABLED,
> + KVM_SET_LINT1_DISABLED,
> +};
> +
> +static void kvm_set_lint1(void *p)
> +{
> + CPUState *env = p;
> +
> + kvm_vcpu_ioctl(env, KVM_SET_LINT1);
> +}
> +
> +void apic_deliver_nmi(DeviceState *d)
> +{
> + APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> + static int kernel_lint1 = KVM_SET_LINT1_UNKNOWN;
> +
> + if (kernel_lint1 == KVM_SET_LINT1_UNKNOWN) {
> + if (kvm_enabled() && kvm_irqchip_in_kernel() &&
> + kvm_check_extension(kvm_state, KVM_CAP_SET_LINT1)) {
That CAP test belongs where the injection shall happen. Here you decide
about user space vs. kernel space APIC model.
Let's try it together:
if kvm_enabled && kvm_irqchip_in_kernel
run_on_cpu(kvm_apic_deliver_nmi)
else
apic_local_deliver(APIC_LVT_LINT1)
with kvm_apic_deliver_nmi like this:
if !check_extension(CAP_SET_LINT1)
get_kernel_apic_state
if !nmi_acceptable
return
kvm_vcpu_ioctl(KVM_NMI)
Please don't trust me blindly and re-check, but this is how the scenario
looks like to me.
Thanks for your patience,
Jan
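
The decision structure outlined above can be sketched as a small pure
function. This is one possible reading, not an authoritative
implementation: the indentation of the pseudocode is lost in the archive,
so the sketch assumes that the final injection ioctl is issued whenever the
LVT check is either skipped (capability present) or passed. All type and
function names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum nmi_path {
    PATH_USER_LINT1,     /* user-space APIC: deliver LINT1 */
    PATH_KERNEL_INJECT,  /* in-kernel APIC: issue the injection ioctl */
    PATH_DROPPED,        /* in-kernel APIC: LINT1 would not accept an NMI */
};

/* Hypothetical snapshot of the facts the pseudocode branches on. */
struct nmi_ctx {
    bool kvm_enabled;
    bool irqchip_in_kernel;
    bool has_lint1_cap;   /* result of the capability check */
    bool nmi_acceptable;  /* LINT1 unmasked and in NMI delivery mode */
};

static enum nmi_path choose_nmi_path(const struct nmi_ctx *c)
{
    assert(c != NULL);
    if (!(c->kvm_enabled && c->irqchip_in_kernel)) {
        return PATH_USER_LINT1;
    }
    /* Without the capability, user space must emulate the LVT check. */
    if (!c->has_lint1_cap && !c->nmi_acceptable) {
        return PATH_DROPPED;
    }
    return PATH_KERNEL_INJECT;
}
```

Keeping the capability test inside the kernel-APIC branch, rather than in
the caller, matches the point made above that the caller should only
decide between the user-space and the kernel-space APIC model.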
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 262 bytes --]
^ permalink raw reply [flat|nested] 69+ messages in thread
* [Qemu-devel] [PATCH 1/1 V5 tuning] kernel/kvm: introduce KVM_SET_LINT1 and fix improper nmi emulation
2011-10-14 6:49 ` Jan Kiszka
` (3 preceding siblings ...)
2011-10-14 9:03 ` [Qemu-devel] [PATCH 2/2 V5] qemu-kvm: fix improper nmi emulation Lai Jiangshan
@ 2011-10-14 9:51 ` Lai Jiangshan
2011-10-14 11:59 ` Sasha Levin
2011-10-14 9:51 ` [Qemu-devel] [PATCH 1/2 V5 tuning] qemu-kvm: Synchronize kernel headers Lai Jiangshan
5 siblings, 1 reply; 69+ messages in thread
From: Lai Jiangshan @ 2011-10-14 9:51 UTC (permalink / raw)
To: Jan Kiszka
Cc: kvm@vger.kernel.org, Avi Kivity, qemu-devel@nongnu.org,
KAMEZAWA Hiroyuki, Kenji Kaneshige
Currently, an NMI is blindly sent to all vCPUs when an NMI button
event occurs. This does not properly emulate real hardware, on which an
NMI button event triggers LINT1. As a result, the NMI is delivered to
the processor even when LINT1 is masked in the LVT. For example, kdump
initiated by NMI sometimes fails on KVM, because kdump assumes that NMI
is masked on CPUs other than CPU0.
With this patch, we introduce KVM_SET_LINT1, which can be used to
correctly emulate the NMI button without changing the existing
KVM_NMI behavior.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
---
arch/x86/kvm/irq.h | 1 +
arch/x86/kvm/lapic.c | 7 +++++++
arch/x86/kvm/x86.c | 8 ++++++++
include/linux/kvm.h | 3 +++
4 files changed, 19 insertions(+), 0 deletions(-)
diff --git a/arch/x86/kvm/irq.h b/arch/x86/kvm/irq.h
index 53e2d08..0c96315 100644
--- a/arch/x86/kvm/irq.h
+++ b/arch/x86/kvm/irq.h
@@ -95,6 +95,7 @@ void kvm_pic_reset(struct kvm_kpic_state *s);
void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu);
void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu);
void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu);
+void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu);
void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu);
void __kvm_migrate_pit_timer(struct kvm_vcpu *vcpu);
void __kvm_migrate_timers(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 57dcbd4..87fe36a 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1039,6 +1039,13 @@ void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu)
kvm_apic_local_deliver(apic, APIC_LVT0);
}
+void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu)
+{
+ struct kvm_lapic *apic = vcpu->arch.apic;
+
+ kvm_apic_local_deliver(apic, APIC_LVT1);
+}
+
static struct kvm_timer_ops lapic_timer_ops = {
.is_periodic = lapic_is_periodic,
};
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 84a28ea..fccd094 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2077,6 +2077,7 @@ int kvm_dev_ioctl_check_extension(long ext)
case KVM_CAP_XSAVE:
case KVM_CAP_ASYNC_PF:
case KVM_CAP_GET_TSC_KHZ:
+ case KVM_CAP_SET_LINT1:
r = 1;
break;
case KVM_CAP_COALESCED_MMIO:
@@ -3264,6 +3265,13 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
goto out;
}
+ case KVM_SET_LINT1: {
+ r = -EINVAL;
+ if (!irqchip_in_kernel(vcpu->kvm))
+ goto out;
+ r = 0;
+ kvm_apic_lint1_deliver(vcpu);
+ }
default:
r = -EINVAL;
}
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index aace6b8..11a2c42 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -554,6 +554,7 @@ struct kvm_ppc_pvinfo {
#define KVM_CAP_PPC_SMT 64
#define KVM_CAP_PPC_RMA 65
#define KVM_CAP_S390_GMAP 71
+#define KVM_CAP_SET_LINT1 72
#ifdef KVM_CAP_IRQ_ROUTING
@@ -759,6 +760,8 @@ struct kvm_clock_data {
#define KVM_CREATE_SPAPR_TCE _IOW(KVMIO, 0xa8, struct kvm_create_spapr_tce)
/* Available with KVM_CAP_RMA */
#define KVM_ALLOCATE_RMA _IOR(KVMIO, 0xa9, struct kvm_allocate_rma)
+/* Available with KVM_CAP_SET_LINT1 for x86 */
+#define KVM_SET_LINT1 _IO(KVMIO, 0xaa)
#define KVM_DEV_ASSIGN_ENABLE_IOMMU (1 << 0)
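
One detail worth checking in the x86.c hunk above: as posted, the
KVM_SET_LINT1 case has no break before `default:`, so in C control falls
through and r is overwritten with -EINVAL even after a successful
delivery. The standalone sketch below reproduces only the shape of that
switch; the command numbers and function names are hypothetical and this
is not kernel code.

```c
#include <assert.h>
#include <errno.h>

enum { CMD_SET_LINT1 = 1, CMD_OTHER = 2 };  /* hypothetical command numbers */

/* Mirrors the shape of the posted hunk: no break before default. */
static int handle_fallthrough(int cmd)
{
    int r;

    switch (cmd) {
    case CMD_SET_LINT1:
        r = 0;
        /* falls through */
    default:
        r = -EINVAL;
    }
    return r;
}

/* Same switch with the break added. */
static int handle_with_break(int cmd)
{
    int r;

    switch (cmd) {
    case CMD_SET_LINT1:
        r = 0;
        break;
    default:
        r = -EINVAL;
    }
    return r;
}
```

In this model the first variant reports -EINVAL for CMD_SET_LINT1, while
the variant with the break reports 0, which suggests the posted hunk would
need a break (or a goto out) after the delivery call.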
^ permalink raw reply related [flat|nested] 69+ messages in thread
* Re: [Qemu-devel] [PATCH 1/1 V5 tuning] kernel/kvm: introduce KVM_SET_LINT1 and fix improper nmi emulation
2011-10-14 9:51 ` [Qemu-devel] [PATCH 1/1 V5 tuning] kernel/kvm: introduce KVM_SET_LINT1 and " Lai Jiangshan
@ 2011-10-14 11:59 ` Sasha Levin
2011-10-14 12:07 ` Jan Kiszka
0 siblings, 1 reply; 69+ messages in thread
From: Sasha Levin @ 2011-10-14 11:59 UTC (permalink / raw)
To: Lai Jiangshan
Cc: kvm@vger.kernel.org, qemu-devel@nongnu.org, Jan Kiszka,
Avi Kivity, Kenji Kaneshige, KAMEZAWA Hiroyuki
On Fri, 2011-10-14 at 17:51 +0800, Lai Jiangshan wrote:
> Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
> button event happens. This doesn't properly emulate real hardware on
> which NMI button event triggers LINT1. Because of this, NMI is sent to
> the processor even when LINT1 is masked in LVT. For example, this
> causes the problem that kdump initiated by NMI sometimes doesn't work
> on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
>
> With this patch, we introduce KVM_SET_LINT1,
> and we can use KVM_SET_LINT1 to correctly emulate NMI button
> without changing the old KVM_NMI behavior.
>
> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
> Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
> ---
It could use a documentation update as well.
> arch/x86/kvm/irq.h | 1 +
> arch/x86/kvm/lapic.c | 7 +++++++
> arch/x86/kvm/x86.c | 8 ++++++++
> include/linux/kvm.h | 3 +++
> 4 files changed, 19 insertions(+), 0 deletions(-)
> diff --git a/arch/x86/kvm/irq.h b/arch/x86/kvm/irq.h
> index 53e2d08..0c96315 100644
> --- a/arch/x86/kvm/irq.h
> +++ b/arch/x86/kvm/irq.h
> @@ -95,6 +95,7 @@ void kvm_pic_reset(struct kvm_kpic_state *s);
> void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu);
> void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu);
> void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu);
> +void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu);
> void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu);
> void __kvm_migrate_pit_timer(struct kvm_vcpu *vcpu);
> void __kvm_migrate_timers(struct kvm_vcpu *vcpu);
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 57dcbd4..87fe36a 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1039,6 +1039,13 @@ void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu)
> kvm_apic_local_deliver(apic, APIC_LVT0);
> }
>
> +void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu)
> +{
> + struct kvm_lapic *apic = vcpu->arch.apic;
> +
> + kvm_apic_local_deliver(apic, APIC_LVT1);
> +}
> +
> static struct kvm_timer_ops lapic_timer_ops = {
> .is_periodic = lapic_is_periodic,
> };
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 84a28ea..fccd094 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2077,6 +2077,7 @@ int kvm_dev_ioctl_check_extension(long ext)
> case KVM_CAP_XSAVE:
> case KVM_CAP_ASYNC_PF:
> case KVM_CAP_GET_TSC_KHZ:
> + case KVM_CAP_SET_LINT1:
> r = 1;
> break;
> case KVM_CAP_COALESCED_MMIO:
> @@ -3264,6 +3265,13 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>
> goto out;
> }
> + case KVM_SET_LINT1: {
> + r = -EINVAL;
> + if (!irqchip_in_kernel(vcpu->kvm))
> + goto out;
> + r = 0;
> + kvm_apic_lint1_deliver(vcpu);
We simply ignore the return value of kvm_apic_local_deliver() and assume
it always works. Why?
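
To make the concern concrete, here is a minimal user-space sketch, not kernel code. It assumes, as the question implies, that the delivery helper reports whether the interrupt was accepted. The mock_* and MOCK_* names are hypothetical and model only the LVT mask check; the wrapper hands the result back to its caller instead of dropping it.

```c
/* User-space mock, not kernel code. All mock_* / MOCK_* names are hypothetical. */
#include <stdint.h>

#define APIC_LVT_MASKED (1u << 16)  /* mask bit of a local vector table entry */
#define MOCK_DM_NMI     0x400u      /* delivery mode field (bits 10:8) = NMI */

struct mock_lapic {
	uint32_t lvt1;     /* LINT1 entry of the local vector table */
	int nmi_pending;   /* set once an NMI has been accepted */
};

/* Returns 1 if the interrupt was accepted, 0 if LINT1 is masked. */
static int mock_local_deliver(struct mock_lapic *apic)
{
	if (apic->lvt1 & APIC_LVT_MASKED)
		return 0;
	apic->nmi_pending = 1;
	return 1;
}

/* Unlike the void wrapper in the patch, hand the result back to the caller. */
static int mock_lint1_deliver(struct mock_lapic *apic)
{
	return mock_local_deliver(apic);
}
```

With lvt1 set to MOCK_DM_NMI the call reports acceptance; with APIC_LVT_MASKED also set it reports that nothing was delivered. Whether the ioctl should surface that result to user space, or silently drop it as real hardware would, is the open question.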
--
Sasha.
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [Qemu-devel] [PATCH 1/1 V5 tuning] kernel/kvm: introduce KVM_SET_LINT1 and fix improper nmi emulation
2011-10-14 11:59 ` Sasha Levin
@ 2011-10-14 12:07 ` Jan Kiszka
2011-10-16 15:01 ` Lai Jiangshan
0 siblings, 1 reply; 69+ messages in thread
From: Jan Kiszka @ 2011-10-14 12:07 UTC (permalink / raw)
To: Lai Jiangshan
Cc: kvm@vger.kernel.org, qemu-devel@nongnu.org, Sasha Levin,
Kenji Kaneshige, KAMEZAWA Hiroyuki, Avi Kivity
On 2011-10-14 13:59, Sasha Levin wrote:
> On Fri, 2011-10-14 at 17:51 +0800, Lai Jiangshan wrote:
>> Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
>> button event happens. This doesn't properly emulate real hardware on
>> which NMI button event triggers LINT1. Because of this, NMI is sent to
>> the processor even when LINT1 is masked in LVT. For example, this
>> causes the problem that kdump initiated by NMI sometimes doesn't work
>> on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
>>
>> With this patch, we introduce KVM_SET_LINT1,
>> and we can use KVM_SET_LINT1 to correctly emulate NMI button
>> without changing the old KVM_NMI behavior.
>>
>> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
>> Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
>> ---
>
> It could use a documentation update as well.
>
>> arch/x86/kvm/irq.h | 1 +
>> arch/x86/kvm/lapic.c | 7 +++++++
>> arch/x86/kvm/x86.c | 8 ++++++++
>> include/linux/kvm.h | 3 +++
>> 4 files changed, 19 insertions(+), 0 deletions(-)
>> diff --git a/arch/x86/kvm/irq.h b/arch/x86/kvm/irq.h
>> index 53e2d08..0c96315 100644
>> --- a/arch/x86/kvm/irq.h
>> +++ b/arch/x86/kvm/irq.h
>> @@ -95,6 +95,7 @@ void kvm_pic_reset(struct kvm_kpic_state *s);
>> void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu);
>> void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu);
>> void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu);
>> +void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu);
>> void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu);
>> void __kvm_migrate_pit_timer(struct kvm_vcpu *vcpu);
>> void __kvm_migrate_timers(struct kvm_vcpu *vcpu);
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index 57dcbd4..87fe36a 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -1039,6 +1039,13 @@ void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu)
>> kvm_apic_local_deliver(apic, APIC_LVT0);
>> }
>>
>> +void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu)
>> +{
>> + struct kvm_lapic *apic = vcpu->arch.apic;
>> +
>> + kvm_apic_local_deliver(apic, APIC_LVT1);
>> +}
>> +
>> static struct kvm_timer_ops lapic_timer_ops = {
>> .is_periodic = lapic_is_periodic,
>> };
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 84a28ea..fccd094 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -2077,6 +2077,7 @@ int kvm_dev_ioctl_check_extension(long ext)
>> case KVM_CAP_XSAVE:
>> case KVM_CAP_ASYNC_PF:
>> case KVM_CAP_GET_TSC_KHZ:
>> + case KVM_CAP_SET_LINT1:
>> r = 1;
>> break;
>> case KVM_CAP_COALESCED_MMIO:
>> @@ -3264,6 +3265,13 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>>
>> goto out;
>> }
>> + case KVM_SET_LINT1: {
>> + r = -EINVAL;
>> + if (!irqchip_in_kernel(vcpu->kvm))
>> + goto out;
>> + r = 0;
>> + kvm_apic_lint1_deliver(vcpu);
>
> We simply ignore the return value of kvm_apic_local_deliver() and assume
> it always works. why?
>
Hmm, I suddenly realized that we switched from enhancing the KVM_NMI
IOCTL to adding KVM_SET_LINT1 - what motivated this?
( Maybe we should let the kernel part settle first before iterating
through user space changes. )
Jan
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [Qemu-devel] [PATCH 1/1 V5 tuning] kernel/kvm: introduce KVM_SET_LINT1 and fix improper nmi emulation
2011-10-14 12:07 ` Jan Kiszka
@ 2011-10-16 15:01 ` Lai Jiangshan
0 siblings, 0 replies; 69+ messages in thread
From: Lai Jiangshan @ 2011-10-16 15:01 UTC (permalink / raw)
To: Jan Kiszka
Cc: kvm@vger.kernel.org, qemu-devel@nongnu.org, Sasha Levin,
Kenji Kaneshige, KAMEZAWA Hiroyuki, Avi Kivity
On 10/14/2011 08:07 PM, Jan Kiszka wrote:
> On 2011-10-14 13:59, Sasha Levin wrote:
>> On Fri, 2011-10-14 at 17:51 +0800, Lai Jiangshan wrote:
>>> Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
>>> button event happens. This doesn't properly emulate real hardware on
>>> which NMI button event triggers LINT1. Because of this, NMI is sent to
>>> the processor even when LINT1 is masked in LVT. For example, this
>>> causes the problem that kdump initiated by NMI sometimes doesn't work
>>> on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
>>>
>>> With this patch, we introduce KVM_SET_LINT1,
>>> and we can use KVM_SET_LINT1 to correctly emulate NMI button
>>> without changing the old KVM_NMI behavior.
>>>
>>> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
>>> Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
>>> ---
>>
>> It could use a documentation update as well.
>>
>>> arch/x86/kvm/irq.h | 1 +
>>> arch/x86/kvm/lapic.c | 7 +++++++
>>> arch/x86/kvm/x86.c | 8 ++++++++
>>> include/linux/kvm.h | 3 +++
>>> 4 files changed, 19 insertions(+), 0 deletions(-)
>>> diff --git a/arch/x86/kvm/irq.h b/arch/x86/kvm/irq.h
>>> index 53e2d08..0c96315 100644
>>> --- a/arch/x86/kvm/irq.h
>>> +++ b/arch/x86/kvm/irq.h
>>> @@ -95,6 +95,7 @@ void kvm_pic_reset(struct kvm_kpic_state *s);
>>> void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu);
>>> void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu);
>>> void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu);
>>> +void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu);
>>> void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu);
>>> void __kvm_migrate_pit_timer(struct kvm_vcpu *vcpu);
>>> void __kvm_migrate_timers(struct kvm_vcpu *vcpu);
>>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>>> index 57dcbd4..87fe36a 100644
>>> --- a/arch/x86/kvm/lapic.c
>>> +++ b/arch/x86/kvm/lapic.c
>>> @@ -1039,6 +1039,13 @@ void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu)
>>> kvm_apic_local_deliver(apic, APIC_LVT0);
>>> }
>>>
>>> +void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu)
>>> +{
>>> + struct kvm_lapic *apic = vcpu->arch.apic;
>>> +
>>> + kvm_apic_local_deliver(apic, APIC_LVT1);
>>> +}
>>> +
>>> static struct kvm_timer_ops lapic_timer_ops = {
>>> .is_periodic = lapic_is_periodic,
>>> };
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index 84a28ea..fccd094 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -2077,6 +2077,7 @@ int kvm_dev_ioctl_check_extension(long ext)
>>> case KVM_CAP_XSAVE:
>>> case KVM_CAP_ASYNC_PF:
>>> case KVM_CAP_GET_TSC_KHZ:
>>> + case KVM_CAP_SET_LINT1:
>>> r = 1;
>>> break;
>>> case KVM_CAP_COALESCED_MMIO:
>>> @@ -3264,6 +3265,13 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>>>
>>> goto out;
>>> }
>>> + case KVM_SET_LINT1: {
>>> + r = -EINVAL;
>>> + if (!irqchip_in_kernel(vcpu->kvm))
>>> + goto out;
>>> + r = 0;
>>> + kvm_apic_lint1_deliver(vcpu);
>>
>> We simply ignore the return value of kvm_apic_local_deliver() and assume
>> it always works. why?
>>
>
> Hmm, I suddenly realized that we switched from enhancing the KVM_NMI
> IOCTL to adding KVM_SET_LINT1 - what motivated this?
Enhancing KVM_NMI directly fixes the problem and matches real
hardware more closely, but it changes the API behavior. (We preferred this one.)
From the previous mails, I found that you and Avi prefer SET_LINT1,
which keeps the old behavior, and that is also OK for us.
I found it hard to implement before, and I switched to
it once you gave me the hint.
>
> ( Maybe we should let the kernel part settle first before iterating
> through user space changes. )
>
Yes, you are right, we should settle the kernel side first,
but I need your and Avi's suggestions.
Thanks,
Lai
^ permalink raw reply [flat|nested] 69+ messages in thread
* [Qemu-devel] [PATCH 1/2 V5 tuning] qemu-kvm: Synchronize kernel headers
2011-10-14 6:49 ` Jan Kiszka
` (4 preceding siblings ...)
2011-10-14 9:51 ` [Qemu-devel] [PATCH 1/1 V5 tuning] kernel/kvm: introduce KVM_SET_LINT1 and " Lai Jiangshan
@ 2011-10-14 9:51 ` Lai Jiangshan
5 siblings, 0 replies; 69+ messages in thread
From: Lai Jiangshan @ 2011-10-14 9:51 UTC (permalink / raw)
To: Jan Kiszka
Cc: kvm@vger.kernel.org, Avi Kivity, qemu-devel@nongnu.org,
KAMEZAWA Hiroyuki, Kenji Kaneshige
Synchronize with the newest kernel headers, which have
KVM_CAP_SET_LINT1 and KVM_SET_LINT1, by running
./scripts/update-linux-headers.sh
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
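linux-headers/asm-powerpc/kvm.h | 19 +++++++++++++++++--
linux-headers/asm-x86/kvm_para.h | 14 ++++++++++++++
linux-headers/linux/kvm.h | 24 +++++++++++++++++-------
linux-headers/linux/kvm_para.h | 1 +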
4 files changed, 49 insertions(+), 9 deletions(-)
diff --git a/linux-headers/asm-powerpc/kvm.h b/linux-headers/asm-powerpc/kvm.h
index 777d307..a4f6c85 100644
--- a/linux-headers/asm-powerpc/kvm.h
+++ b/linux-headers/asm-powerpc/kvm.h
@@ -22,6 +22,10 @@
#include <linux/types.h>
+/* Select powerpc specific features in <linux/kvm.h> */
+#define __KVM_HAVE_SPAPR_TCE
+#define __KVM_HAVE_PPC_SMT
+
struct kvm_regs {
__u64 pc;
__u64 cr;
@@ -166,8 +170,8 @@ struct kvm_sregs {
} ppc64;
struct {
__u32 sr[16];
- __u64 ibat[8];
- __u64 dbat[8];
+ __u64 ibat[8];
+ __u64 dbat[8];
} ppc32;
} s;
struct {
@@ -272,4 +276,15 @@ struct kvm_guest_debug_arch {
#define KVM_INTERRUPT_UNSET -2U
#define KVM_INTERRUPT_SET_LEVEL -3U
+/* for KVM_CAP_SPAPR_TCE */
+struct kvm_create_spapr_tce {
+ __u64 liobn;
+ __u32 window_size;
+};
+
+/* for KVM_ALLOCATE_RMA */
+struct kvm_allocate_rma {
+ __u64 rma_size;
+};
+
#endif /* __LINUX_KVM_POWERPC_H */
diff --git a/linux-headers/asm-x86/kvm_para.h b/linux-headers/asm-x86/kvm_para.h
index 834d71e..f2ac46a 100644
--- a/linux-headers/asm-x86/kvm_para.h
+++ b/linux-headers/asm-x86/kvm_para.h
@@ -21,6 +21,7 @@
*/
#define KVM_FEATURE_CLOCKSOURCE2 3
#define KVM_FEATURE_ASYNC_PF 4
+#define KVM_FEATURE_STEAL_TIME 5
/* The last 8 bits are used to indicate how to interpret the flags field
* in pvclock structure. If no bits are set, all flags are ignored.
@@ -30,10 +31,23 @@
#define MSR_KVM_WALL_CLOCK 0x11
#define MSR_KVM_SYSTEM_TIME 0x12
+#define KVM_MSR_ENABLED 1
/* Custom MSRs falls in the range 0x4b564d00-0x4b564dff */
#define MSR_KVM_WALL_CLOCK_NEW 0x4b564d00
#define MSR_KVM_SYSTEM_TIME_NEW 0x4b564d01
#define MSR_KVM_ASYNC_PF_EN 0x4b564d02
+#define MSR_KVM_STEAL_TIME 0x4b564d03
+
+struct kvm_steal_time {
+ __u64 steal;
+ __u32 version;
+ __u32 flags;
+ __u32 pad[12];
+};
+
+#define KVM_STEAL_ALIGNMENT_BITS 5
+#define KVM_STEAL_VALID_BITS ((-1ULL << (KVM_STEAL_ALIGNMENT_BITS + 1)))
+#define KVM_STEAL_RESERVED_MASK (((1 << KVM_STEAL_ALIGNMENT_BITS) - 1 ) << 1)
#define KVM_MAX_MMU_OP_BATCH 32
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index fc63b73..0fd246f 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -161,6 +161,7 @@ struct kvm_pit_config {
#define KVM_EXIT_NMI 16
#define KVM_EXIT_INTERNAL_ERROR 17
#define KVM_EXIT_OSI 18
+#define KVM_EXIT_PAPR_HCALL 19
/* For KVM_EXIT_INTERNAL_ERROR */
#define KVM_INTERNAL_ERROR_EMULATION 1
@@ -264,6 +265,11 @@ struct kvm_run {
struct {
__u64 gprs[32];
} osi;
+ struct {
+ __u64 nr;
+ __u64 ret;
+ __u64 args[9];
+ } papr_hcall;
/* Fix the size of the union. */
char padding[256];
};
@@ -544,6 +550,11 @@ struct kvm_ppc_pvinfo {
#define KVM_CAP_TSC_CONTROL 60
#define KVM_CAP_GET_TSC_KHZ 61
#define KVM_CAP_PPC_BOOKE_SREGS 62
+#define KVM_CAP_SPAPR_TCE 63
+#define KVM_CAP_PPC_SMT 64
+#define KVM_CAP_PPC_RMA 65
+#define KVM_CAP_S390_GMAP 71
+#define KVM_CAP_SET_LINT1 72
#ifdef KVM_CAP_IRQ_ROUTING
@@ -746,6 +757,11 @@ struct kvm_clock_data {
/* Available with KVM_CAP_XCRS */
#define KVM_GET_XCRS _IOR(KVMIO, 0xa6, struct kvm_xcrs)
#define KVM_SET_XCRS _IOW(KVMIO, 0xa7, struct kvm_xcrs)
+#define KVM_CREATE_SPAPR_TCE _IOW(KVMIO, 0xa8, struct kvm_create_spapr_tce)
+/* Available with KVM_CAP_RMA */
+#define KVM_ALLOCATE_RMA _IOR(KVMIO, 0xa9, struct kvm_allocate_rma)
+/* Available with KVM_CAP_SET_LINT1 for x86 */
+#define KVM_SET_LINT1 _IO(KVMIO, 0xaa)
#define KVM_DEV_ASSIGN_ENABLE_IOMMU (1 << 0)
@@ -773,20 +789,14 @@ struct kvm_assigned_pci_dev {
struct kvm_assigned_irq {
__u32 assigned_dev_id;
- __u32 host_irq;
+ __u32 host_irq; /* ignored (legacy field) */
__u32 guest_irq;
__u32 flags;
union {
- struct {
- __u32 addr_lo;
- __u32 addr_hi;
- __u32 data;
- } guest_msi;
__u32 reserved[12];
};
};
-
struct kvm_assigned_msix_nr {
__u32 assigned_dev_id;
__u16 entry_nr;
diff --git a/linux-headers/linux/kvm_para.h b/linux-headers/linux/kvm_para.h
index 7bdcf93..b315e27 100644
--- a/linux-headers/linux/kvm_para.h
+++ b/linux-headers/linux/kvm_para.h
@@ -26,3 +26,4 @@
#include <asm/kvm_para.h>
#endif /* __LINUX_KVM_PARA_H */
+
^ permalink raw reply related [flat|nested] 69+ messages in thread