qemu-devel.nongnu.org archive mirror
* [Qemu-devel] Re: Errors on MMIO read access on VM suspend / resume operations
       [not found]     ` <4D3303FD.8020509@redhat.com>
@ 2011-01-18  3:03       ` Stefan Berger
  2011-01-18  8:53         ` Jan Kiszka
  0 siblings, 1 reply; 14+ messages in thread
From: Stefan Berger @ 2011-01-18  3:03 UTC (permalink / raw)
  To: Avi Kivity; +Cc: qemu-devel, kvm

On 01/16/2011 09:43 AM, Avi Kivity wrote:
> On 01/14/2011 09:27 PM, Stefan Berger wrote:
>>
>>>
>>> Can you sprinkle some printfs() around kvm_run (in qemu-kvm.c) to
>>> verify this?
>>>
>> Here's what I did:
>>
>>
>> interrupt exit requested
>
> It appears from this you're using qemu.git.  Please try qemu-kvm.git, 
> where the code appears to be correct.
>
Cc'ing qemu-devel now. For reference, here is the initial problem description:

http://www.spinics.net/lists/kvm/msg48274.html

I didn't know there was another tree...

I have seen now a couple of suspends-while-reading with patches applied 
to the qemu-kvm.git tree and indeed, when run with the same host kernel 
and VM I do not see the debugging dumps due to double-reads that I would 
have anticipated seeing by now. Now what? Can this be easily fixed in 
the other Qemu tree as well?

One thing I'd like to mention is that I have seen what I think are 
interrupt stalls when running my tests inside the qemu-kvm.git tree 
version and not suspending at all. At some point the interrupt counter in
the guest kernel does not increase anymore even though I see the device 
model raising the IRQ and lowering it. The same tests run literally 
forever in the qemu.git tree version of Qemu.

Regards,
    Stefan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Qemu-devel] Re: Errors on MMIO read access on VM suspend / resume operations
  2011-01-18  3:03       ` [Qemu-devel] Re: Errors on MMIO read access on VM suspend / resume operations Stefan Berger
@ 2011-01-18  8:53         ` Jan Kiszka
  2011-01-24 18:27           ` Stefan Berger
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kiszka @ 2011-01-18  8:53 UTC (permalink / raw)
  To: Stefan Berger; +Cc: Avi Kivity, kvm, qemu-devel

On 2011-01-18 04:03, Stefan Berger wrote:
> On 01/16/2011 09:43 AM, Avi Kivity wrote:
>> On 01/14/2011 09:27 PM, Stefan Berger wrote:
>>>
>>>>
>>>> Can you sprinkle some printfs() around kvm_run (in qemu-kvm.c) to
>>>> verify this?
>>>>
>>> Here's what I did:
>>>
>>>
>>> interrupt exit requested
>>
>> It appears from this you're using qemu.git.  Please try qemu-kvm.git,
>> where the code appears to be correct.
>>
> Cc'ing qemu-devel now. For reference, here is the initial problem description:
> 
> http://www.spinics.net/lists/kvm/msg48274.html
> 
> I didn't know there was another tree...
> 
> I have seen now a couple of suspends-while-reading with patches applied
> to the qemu-kvm.git tree and indeed, when run with the same host kernel
> and VM I do not see the debugging dumps due to double-reads that I would
> have anticipated seeing by now. Now what? Can this be easily fixed in
> the other Qemu tree as well?

Please give this a try:

git://git.kiszka.org/qemu-kvm.git queues/kvm-upstream

I bet (& hope) "kvm: Unconditionally reenter kernel after IO exits"
fixes the issue for you. If other problems pop up with that tree, also
try resetting to that particular commit.
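
To illustrate why completing the I/O before stopping matters, here is a toy
model. It is not QEMU or KVM code, and every name in it (toy_run,
toy_simulate, toy_fifo, ...) is made up for this sketch. It assumes that an
MMIO read instruction is only retired when user space re-enters the kernel
with the read data, and that a suspend taken before that point saves a program
counter still pointing at the read. Under those assumptions a device whose
reads have side effects is read twice and one byte is lost, which would match
the double-read symptom described in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy device: each MMIO read pops the next byte (a read with side effects). */
struct toy_fifo {
    const unsigned char *data;
    int pos;
    int reads;                  /* number of device reads performed */
};

static unsigned char toy_fifo_read(struct toy_fifo *f)
{
    f->reads++;
    return f->data[f->pos++];
}

/* Toy vCPU with a single "load from MMIO" instruction at pc 0. */
struct toy_vcpu {
    int pc;
    unsigned char reg;
    bool mmio_pending;          /* user space has data the kernel has not consumed */
    unsigned char mmio_data;
};

/* Models one entry into the kernel; returns true on an MMIO read exit. */
static bool toy_run(struct toy_vcpu *v)
{
    if (v->mmio_pending) {      /* finish the interrupted load first */
        v->reg = v->mmio_data;
        v->mmio_pending = false;
        v->pc++;
        return false;
    }
    return v->pc == 0;          /* exit for the load, pc not yet advanced */
}

/*
 * A stop request arrives right after the MMIO exit was handled.
 * Returns how many device reads were needed to retire the one load.
 */
static int toy_simulate(bool reenter_before_stop, unsigned char *reg_out)
{
    static const unsigned char bytes[] = { 0xAA, 0xBB };
    struct toy_fifo dev = { bytes, 0, 0 };
    struct toy_vcpu v = { 0, 0, false, 0 };

    assert(reg_out != NULL);

    if (toy_run(&v)) {
        v.mmio_data = toy_fifo_read(&dev);
        v.mmio_pending = true;
    }
    if (reenter_before_stop)
        toy_run(&v);            /* completes the load, pc becomes 1 */

    /* Suspend: only pc and reg survive, the pending read data is dropped. */
    struct toy_vcpu resumed = { v.pc, v.reg, false, 0 };

    /* Resume: run until the load has been retired. */
    while (resumed.pc == 0) {
        if (toy_run(&resumed)) {
            resumed.mmio_data = toy_fifo_read(&dev);
            resumed.mmio_pending = true;
        }
    }
    *reg_out = resumed.reg;
    return dev.reads;
}
```

In this model, toy_simulate(false, &reg) needs two device reads and the guest
ends up with the second byte, while toy_simulate(true, &reg) needs one read
and the guest gets the first byte.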

I'm currently trying to shake all those hidden or forgotten bug fixes
out of qemu-kvm and port them upstream. Most of those subtle differences
should hopefully soon be history.

> 
> One thing I'd like to mention is that I have seen what I think are
> interrupt stalls when running my tests inside the qemu-kvm.git tree
> version and not suspending at all. At some point the interrupt counter in
> the guest kernel does not increase anymore even though I see the device
> model raising the IRQ and lowering it. The same tests run literally
> forever in the qemu.git tree version of Qemu.

What about qemu-kvm and -no-kvm-irqchip?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Qemu-devel] Re: Errors on MMIO read access on VM suspend / resume operations
  2011-01-18  8:53         ` Jan Kiszka
@ 2011-01-24 18:27           ` Stefan Berger
  2011-01-24 22:34             ` Jan Kiszka
  0 siblings, 1 reply; 14+ messages in thread
From: Stefan Berger @ 2011-01-24 18:27 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Avi Kivity, kvm, qemu-devel

On 01/18/2011 03:53 AM, Jan Kiszka wrote:
> On 2011-01-18 04:03, Stefan Berger wrote:
>> On 01/16/2011 09:43 AM, Avi Kivity wrote:
>>> On 01/14/2011 09:27 PM, Stefan Berger wrote:
>>>>> Can you sprinkle some printfs() around kvm_run (in qemu-kvm.c) to
>>>>> verify this?
>>>>>
>>>> Here's what I did:
>>>>
>>>>
>>>> interrupt exit requested
>>> It appears from this you're using qemu.git.  Please try qemu-kvm.git,
>>> where the code appears to be correct.
>>>
>> Cc'ing qemu-devel now. For reference, here is the initial problem description:
>>
>> http://www.spinics.net/lists/kvm/msg48274.html
>>
>> I didn't know there was another tree...
>>
>> I have seen now a couple of suspends-while-reading with patches applied
>> to the qemu-kvm.git tree and indeed, when run with the same host kernel
>> and VM I do not see the debugging dumps due to double-reads that I would
>> have anticipated seeing by now. Now what? Can this be easily fixed in
>> the other Qemu tree as well?
> Please give this a try:
>
> git://git.kiszka.org/qemu-kvm.git queues/kvm-upstream
>
> I bet (&  hope) "kvm: Unconditionally reenter kernel after IO exits"
> fixes the issue for you. If other problems pop up with that tree, also
> try resetting to that particular commit.
>
> I'm currently trying to shake all those hidden or forgotten bug fixes
> out of qemu-kvm and port them upstream. Most of those subtle differences
> should hopefully soon be history.
>
I did the same test as I did with Avi's tree and haven't seen the 
consequences of possible double-reads. So, I would say that you should 
upstream those patches...

I searched for the text you mention above using 'gitk' but couldn't find 
a patch with that headline in your tree. There were others that seem to 
be related:

Gleb Natapov: "do not enter vcpu again if it was stopped during IO"

>> One thing I'd like to mention is that I have seen what I think are
>> interrupt stalls when running my tests inside the qemu-kvm.git tree
>> version and not suspending at all. At some point the interrupt counter in
>> the guest kernel does not increase anymore even though I see the device
>> model raising the IRQ and lowering it. The same tests run literally
>> forever in the qemu.git tree version of Qemu.
> What about qemu-kvm and -no-kvm-irqchip?
That seems to be necessary for both trees, yours and the one Avi pointed 
me to. If applied, then I did not see the interrupt problem.

     Stefan
> Jan
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Qemu-devel] Re: Errors on MMIO read access on VM suspend / resume operations
  2011-01-24 18:27           ` Stefan Berger
@ 2011-01-24 22:34             ` Jan Kiszka
  2011-01-25  3:13               ` Stefan Berger
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kiszka @ 2011-01-24 22:34 UTC (permalink / raw)
  To: Stefan Berger; +Cc: Avi Kivity, kvm, qemu-devel


On 2011-01-24 19:27, Stefan Berger wrote:
> On 01/18/2011 03:53 AM, Jan Kiszka wrote:
>> On 2011-01-18 04:03, Stefan Berger wrote:
>>> On 01/16/2011 09:43 AM, Avi Kivity wrote:
>>>> On 01/14/2011 09:27 PM, Stefan Berger wrote:
>>>>>> Can you sprinkle some printfs() around kvm_run (in qemu-kvm.c) to
>>>>>> verify this?
>>>>>>
>>>>> Here's what I did:
>>>>>
>>>>>
>>>>> interrupt exit requested
>>>> It appears from this you're using qemu.git.  Please try qemu-kvm.git,
>>>> where the code appears to be correct.
>>>>
>>> Cc'ing qemu-devel now. For reference, here is the initial problem
>>> description:
>>>
>>> http://www.spinics.net/lists/kvm/msg48274.html
>>>
>>> I didn't know there was another tree...
>>>
>>> I have seen now a couple of suspends-while-reading with patches applied
>>> to the qemu-kvm.git tree and indeed, when run with the same host kernel
>>> and VM I do not see the debugging dumps due to double-reads that I would
>>> have anticipated seeing by now. Now what? Can this be easily fixed in
>>> the other Qemu tree as well?
>> Please give this a try:
>>
>> git://git.kiszka.org/qemu-kvm.git queues/kvm-upstream
>>
>> I bet (&  hope) "kvm: Unconditionally reenter kernel after IO exits"
>> fixes the issue for you. If other problems pop up with that tree, also
>> try resetting to that particular commit.
>>
>> I'm currently trying to shake all those hidden or forgotten bug fixes
>> out of qemu-kvm and port them upstream. Most of those subtle differences
>> should hopefully soon be history.
>>
> I did the same test as I did with Avi's tree and haven't seen the
> consequences of possible double-reads. So, I would say that you should
> upstream those patches...
> 
> I searched for the text you mention above using 'gitk' but couldn't find
> a patch with that headline in your tree. There were others that seem to
> be related:
> 
> Gleb Natapov: "do not enter vcpu again if it was stopped during IO"

Err, I don't think you checked out queues/kvm-upstream. I bet you just
ran my master branch which is a version of qemu-kvm's master. Am I right? :)

>>> One thing I'd like to mention is that I have seen what I think are
>>> interrupt stalls when running my tests inside the qemu-kvm.git tree
>>> version and not suspending at all. At some point the interrupt counter in
>>> the guest kernel does not increase anymore even though I see the device
>>> model raising the IRQ and lowering it. The same tests run literally
>>> forever in the qemu.git tree version of Qemu.
>> What about qemu-kvm and -no-kvm-irqchip?
> That seems to be necessary for both trees, yours and the one Avi pointed
> me to. If applied, then I did not see the interrupt problem.

And the fact that you were able to call qemu from my tree with
-no-kvm-irqchip just underlines my assumption: that switch is refused by
upstream. Please retry with the latest kvm-upstream queue.

Besides that, this other bug you may see in the in-kernel IRQ path - how
can we reproduce it?

Jan



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] Re: Errors on MMIO read access on VM suspend / resume operations
  2011-01-24 22:34             ` Jan Kiszka
@ 2011-01-25  3:13               ` Stefan Berger
  2011-01-25  7:26                 ` Jan Kiszka
  0 siblings, 1 reply; 14+ messages in thread
From: Stefan Berger @ 2011-01-25  3:13 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Avi Kivity, kvm, qemu-devel

On 01/24/2011 05:34 PM, Jan Kiszka wrote:
> On 2011-01-24 19:27, Stefan Berger wrote:
>> On 01/18/2011 03:53 AM, Jan Kiszka wrote:
>>> On 2011-01-18 04:03, Stefan Berger wrote:
>>>> On 01/16/2011 09:43 AM, Avi Kivity wrote:
>>>>> On 01/14/2011 09:27 PM, Stefan Berger wrote:
>>>>>>> Can you sprinkle some printfs() around kvm_run (in qemu-kvm.c) to
>>>>>>> verify this?
>>>>>>>
>>>>>> Here's what I did:
>>>>>>
>>>>>>
>>>>>> interrupt exit requested
>>>>> It appears from this you're using qemu.git.  Please try qemu-kvm.git,
>>>>> where the code appears to be correct.
>>>>>
>>>> Cc'ing qemu-devel now. For reference, here is the initial problem
>>>> description:
>>>>
>>>> http://www.spinics.net/lists/kvm/msg48274.html
>>>>
>>>> I didn't know there was another tree...
>>>>
>>>> I have seen now a couple of suspends-while-reading with patches applied
>>>> to the qemu-kvm.git tree and indeed, when run with the same host kernel
>>>> and VM I do not see the debugging dumps due to double-reads that I would
>>>> have anticipated seeing by now. Now what? Can this be easily fixed in
>>>> the other Qemu tree as well?
>>> Please give this a try:
>>>
>>> git://git.kiszka.org/qemu-kvm.git queues/kvm-upstream
>>>
>>> I bet (&   hope) "kvm: Unconditionally reenter kernel after IO exits"
>>> fixes the issue for you. If other problems pop up with that tree, also
>>> try resetting to that particular commit.
>>>
>>> I'm currently trying to shake all those hidden or forgotten bug fixes
>>> out of qemu-kvm and port them upstream. Most of those subtle differences
>>> should hopefully soon be history.
>>>
>> I did the same test as I did with Avi's tree and haven't seen the
>> consequences of possible double-reads. So, I would say that you should
>> upstream those patches...
>>
>> I searched for the text you mention above using 'gitk' but couldn't find
>> a patch with that headline in your tree. There were others that seem to
>> be related:
>>
>> Gleb Natapov: "do not enter vcpu again if it was stopped during IO"
> Err, I don't think you checked out queues/kvm-upstream. I bet you just
> ran my master branch which is a version of qemu-kvm's master. Am I right? :)
>

You're right. :-) my lack of git knowledge -  checked out the branch now.

I redid the testing and it passed. No double-reads and lost bytes from 
what I could see.

>>>> One thing I'd like to mention is that I have seen what I think are
>>>> interrupt stalls when running my tests inside the qemu-kvm.git tree
>>>> version and not suspending at all. At some point the interrupt counter in
>>>> the guest kernel does not increase anymore even though I see the device
>>>> model raising the IRQ and lowering it. The same tests run literally
>>>> forever in the qemu.git tree version of Qemu.
>>> What about qemu-kvm and -no-kvm-irqchip?
>> That seems to be necessary for both trees, yours and the one Avi pointed
>> me to. If applied, then I did not see the interrupt problem.
> And the fact that you were able to call qemu from my tree with
> -no-kvm-irqchip just underlines my assumption: that switch is refused by
> upstream. Please retry with the latest kvm-upstream queue.
>
> Besides that, this other bug you may see in the in-kernel IRQ path - how
> can we reproduce it?
Unfortunately I don't know. Some things have to come together for the 
code I am working on to become available and useful for everyone. It's 
going to be a while.

Thanks!
    Stefan
> Jan
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] Re: Errors on MMIO read access on VM suspend / resume operations
  2011-01-25  3:13               ` Stefan Berger
@ 2011-01-25  7:26                 ` Jan Kiszka
  2011-01-25 16:49                   ` Stefan Berger
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kiszka @ 2011-01-25  7:26 UTC (permalink / raw)
  To: Stefan Berger; +Cc: Avi Kivity, kvm, qemu-devel


On 2011-01-25 04:13, Stefan Berger wrote:
> On 01/24/2011 05:34 PM, Jan Kiszka wrote:
>> On 2011-01-24 19:27, Stefan Berger wrote:
>>> On 01/18/2011 03:53 AM, Jan Kiszka wrote:
>>>> On 2011-01-18 04:03, Stefan Berger wrote:
>>>>> On 01/16/2011 09:43 AM, Avi Kivity wrote:
>>>>>> On 01/14/2011 09:27 PM, Stefan Berger wrote:
>>>>>>>> Can you sprinkle some printfs() around kvm_run (in qemu-kvm.c) to
>>>>>>>> verify this?
>>>>>>>>
>>>>>>> Here's what I did:
>>>>>>>
>>>>>>>
>>>>>>> interrupt exit requested
>>>>>> It appears from this you're using qemu.git.  Please try qemu-kvm.git,
>>>>>> where the code appears to be correct.
>>>>>>
>>>>> Cc'ing qemu-devel now. For reference, here is the initial problem
>>>>> description:
>>>>>
>>>>> http://www.spinics.net/lists/kvm/msg48274.html
>>>>>
>>>>> I didn't know there was another tree...
>>>>>
>>>>> I have seen now a couple of suspends-while-reading with patches
>>>>> applied
>>>>> to the qemu-kvm.git tree and indeed, when run with the same host
>>>>> kernel
>>>>> and VM I do not see the debugging dumps due to double-reads that I
>>>>> would
>>>>> have anticipated seeing by now. Now what? Can this be easily fixed in
>>>>> the other Qemu tree as well?
>>>> Please give this a try:
>>>>
>>>> git://git.kiszka.org/qemu-kvm.git queues/kvm-upstream
>>>>
>>>> I bet (&   hope) "kvm: Unconditionally reenter kernel after IO exits"
>>>> fixes the issue for you. If other problems pop up with that tree, also
>>>> try resetting to that particular commit.
>>>>
>>>> I'm currently trying to shake all those hidden or forgotten bug fixes
>>>> out of qemu-kvm and port them upstream. Most of those subtle
>>>> differences
>>>> should hopefully soon be history.
>>>>
>>> I did the same test as I did with Avi's tree and haven't seen the
>>> consequences of possible double-reads. So, I would say that you should
>>> upstream those patches...
>>>
>>> I searched for the text you mention above using 'gitk' but couldn't find
>>> a patch with that headline in your tree. There were others that seem to
>>> be related:
>>>
>>> Gleb Natapov: "do not enter vcpu again if it was stopped during IO"
>> Err, I don't think you checked out queues/kvm-upstream. I bet you just
>> ran my master branch which is a version of qemu-kvm's master. Am I
>> right? :)
>>
> 
> You're right. :-) my lack of git knowledge -  checked out the branch now.
> 
> I redid the testing and it passed. No double-reads and lost bytes from
> what I could see.

Great, thanks.

> 
>>>>> One thing I'd like to mention is that I have seen what I think are
>>>>> interrupt stalls when running my tests inside the qemu-kvm.git tree
>>>>> version and not suspending at all. At some point the interrupt
>>>>> counter in
>>>>> the guest kernel does not increase anymore even though I see the
>>>>> device
>>>>> model raising the IRQ and lowering it. The same tests run literally
>>>>> forever in the qemu.git tree version of Qemu.
>>>> What about qemu-kvm and -no-kvm-irqchip?
>>> That seems to be necessary for both trees, yours and the one Avi pointed
>>> me to. If applied, then I did not see the interrupt problem.
>> And the fact that you were able to call qemu from my tree with
>> -no-kvm-irqchip just underlines my assumption: that switch is refused by
>> upstream. Please retry with the latest kvm-upstream queue.
>>
>> Besides that, this other bug you may see in the in-kernel IRQ path - how
>> can we reproduce it?
> Unfortunately I don't know. Some things have to come together for the
> code I am working on to become available and useful for everyone. It's
> going to be a while.

Do you see a chance to look closer at the issue yourself? E.g.
instrument the kernel's irqchip models and dump their states once your
guest is stuck?

Jan



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] Re: Errors on MMIO read access on VM suspend / resume operations
  2011-01-25  7:26                 ` Jan Kiszka
@ 2011-01-25 16:49                   ` Stefan Berger
  2011-01-26  8:14                     ` Jan Kiszka
  0 siblings, 1 reply; 14+ messages in thread
From: Stefan Berger @ 2011-01-25 16:49 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Avi Kivity, kvm, qemu-devel

On 01/25/2011 02:26 AM, Jan Kiszka wrote:
>
> Do you see a chance to look closer at the issue yourself? E.g.
> instrument the kernel's irqchip models and dump their states once your
> guest is stuck?
The device runs on IRQ 3. So I applied this patch here.

diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
index 3cece05..8f4f94c 100644
--- a/arch/x86/kvm/i8259.c
+++ b/arch/x86/kvm/i8259.c
@@ -106,7 +106,7 @@ static inline int pic_set_irq1(struct kvm_kpic_state *s, int irq, int level)
 {
 	int mask, ret = 1;
 	mask = 1 << irq;
-	if (s->elcr & mask)	/* level triggered */
+	if (s->elcr & mask)	/* level triggered */ {
 		if (level) {
 			ret = !(s->irr & mask);
 			s->irr |= mask;
@@ -115,7 +115,10 @@ static inline int pic_set_irq1(struct kvm_kpic_state *s, int irq, int level)
 			s->irr &= ~mask;
 			s->last_irr &= ~mask;
 		}
-	else	/* edge triggered */
+if (irq == 3)
+    printk("%s %d: level=%d, irr = %x\n", __FUNCTION__,__LINE__,level, s->irr);
+        }
+	else	/* edge triggered */ {
 		if (level) {
 			if ((s->last_irr & mask) == 0) {
 				ret = !(s->irr & mask);
@@ -124,7 +127,9 @@ static inline int pic_set_irq1(struct kvm_kpic_state *s, int irq, int level)
 			s->last_irr |= mask;
 		} else
 			s->last_irr &= ~mask;
-
+if (irq == 3)
+    printk("%s %d: level=%d, irr = %x\n", __FUNCTION__,__LINE__,level, s->irr);
+        }
 	return (s->imr & mask) ? -1 : ret;
 }

@@ -206,6 +211,8 @@ int kvm_pic_set_irq(void *opaque, int irq, int level)

 	pic_lock(s);
 	if (irq >= 0 && irq < PIC_NUM_PINS) {
+if (irq == 3)
+printk("%s\n", __FUNCTION__);
 		ret = pic_set_irq1(&s->pics[irq >> 3], irq & 7, level);
 		pic_update_irq(s);
 		trace_kvm_pic_set_irq(irq >> 3, irq & 7, s->pics[irq >> 3].elcr,



While it's still working I see this here with the levels changing 0-1-0. 
Though then it stops and levels are only at '1'.

[ 1773.833824] kvm_pic_set_irq
[ 1773.833827] pic_set_irq1 131: level=0, irr = 5b
[ 1773.834161] kvm_pic_set_irq
[ 1773.834163] pic_set_irq1 131: level=1, irr = 5b
[ 1773.834193] kvm_pic_set_irq
[ 1773.834195] pic_set_irq1 131: level=0, irr = 5b
[ 1773.835028] kvm_pic_set_irq
[ 1773.835031] pic_set_irq1 131: level=1, irr = 5b
[ 1773.835542] kvm_pic_set_irq
[ 1773.835545] pic_set_irq1 131: level=1, irr = 5b
[ 1773.889892] kvm_pic_set_irq
[ 1773.889894] pic_set_irq1 131: level=1, irr = 5b
[ 1791.258793] pic_set_irq1 119: level=1, irr = d9
[ 1791.258824] pic_set_irq1 119: level=0, irr = d1
[ 1791.402476] pic_set_irq1 119: level=1, irr = d9
[ 1791.402534] pic_set_irq1 119: level=0, irr = d1
[ 1791.402538] pic_set_irq1 119: level=1, irr = d9
[...]


I believe the last 5 shown calls can be ignored. After that the 
interrupts don't go through anymore.
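
One possible reading of why those last lines show up without a preceding
kvm_pic_set_irq line: the patch tests "irq == 3" against the global IRQ number
in kvm_pic_set_irq(), but against the per-chip pin (irq & 7) in
pic_set_irq1(). The sketch below only mirrors the "irq >> 3" / "irq & 7" split
visible in the patch. The names toy_decode_irq, toy_irr_bit and
TOY_PIC_NUM_PINS (assumed to be 16) are made up for illustration.

```c
#include <assert.h>

#define TOY_PIC_NUM_PINS 16     /* assumed stand-in for PIC_NUM_PINS */

struct toy_pin {
    int chip;                   /* 0 = first PIC, 1 = second PIC */
    int pin;                    /* input pin on that chip, 0..7 */
};

/* Same split as pic_set_irq1(&s->pics[irq >> 3], irq & 7, level). */
static struct toy_pin toy_decode_irq(int irq)
{
    assert(irq >= 0 && irq < TOY_PIC_NUM_PINS);
    struct toy_pin p = { irq >> 3, irq & 7 };
    return p;
}

/* Extract the request bit of one pin from a printed irr value. */
static int toy_irr_bit(unsigned int irr, int pin)
{
    assert(pin >= 0 && pin < 8);
    return (int)((irr >> pin) & 1u);
}
```

With this split, IRQ 3 and IRQ 11 both decode to pin 3 (on chip 0 and chip 1
respectively), so a check on the pin alone cannot tell them apart. Applied to
the values in the log, bit 3 is set in 5b and d9 and clear in d1.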

In the device model I see interrupts being raised and cleared. After the 
last one was cleared in 'my' device model, only interrupts are raised. 
This looks as if the interrupt handler in the guest Linux was never
run, thus the IRQ is never cleared and we're stuck.
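
The difference between the two branches of pic_set_irq1() can be tried out
with a small stand-alone model. It copies only the irr/last_irr updates from
the code shown in the patch and leaves out the mask and return value. toy_ack()
is an assumption of this sketch (an acknowledge that clears an edge-triggered
request); it is not taken from the patch. All toy_* names are made up.

```c
#include <assert.h>

struct toy_pic {
    unsigned char irr;          /* pending interrupt requests */
    unsigned char last_irr;     /* last seen line levels, for edge detection */
    unsigned char elcr;         /* bit set = pin is level triggered */
};

/* irr/last_irr updates with the same branch structure as pic_set_irq1(). */
static void toy_set_irq(struct toy_pic *s, int pin, int level)
{
    assert(pin >= 0 && pin < 8);
    unsigned char mask = (unsigned char)(1u << pin);

    if (s->elcr & mask) {               /* level triggered */
        if (level) {
            s->irr |= mask;
            s->last_irr |= mask;
        } else {
            s->irr &= (unsigned char)~mask;
            s->last_irr &= (unsigned char)~mask;
        }
    } else {                            /* edge triggered */
        if (level) {
            if ((s->last_irr & mask) == 0)
                s->irr |= mask;         /* only a rising edge latches a request */
            s->last_irr |= mask;
        } else {
            s->last_irr &= (unsigned char)~mask;
        }
    }
}

/* Assumption: acknowledging clears an edge-triggered request only. */
static void toy_ack(struct toy_pic *s, int pin)
{
    assert(pin >= 0 && pin < 8);
    unsigned char mask = (unsigned char)(1u << pin);

    if (!(s->elcr & mask))
        s->irr &= (unsigned char)~mask;
}

static int toy_pending(const struct toy_pic *s, int pin)
{
    return (s->irr >> pin) & 1;
}
```

In the edge-triggered case, lowering the line does not clear a latched
request, and raising a line that was never lowered latches nothing new. That
would be consistent with the log above, where the edge-triggered branch keeps
printing irr = 5b (bit 3 set) regardless of the level, and with the line then
staying at level 1.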



Regards,
     Stefan

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] Re: Errors on MMIO read access on VM suspend / resume operations
  2011-01-25 16:49                   ` Stefan Berger
@ 2011-01-26  8:14                     ` Jan Kiszka
  2011-01-26 12:05                       ` Stefan Berger
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kiszka @ 2011-01-26  8:14 UTC (permalink / raw)
  To: Stefan Berger; +Cc: Avi Kivity, kvm, qemu-devel


On 2011-01-25 17:49, Stefan Berger wrote:
> On 01/25/2011 02:26 AM, Jan Kiszka wrote:
>>
>> Do you see a chance to look closer at the issue yourself? E.g.
>> instrument the kernel's irqchip models and dump their states once your
>> guest is stuck?
> The device runs on IRQ 3. So I applied this patch here.
> 
> diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
> index 3cece05..8f4f94c 100644
> --- a/arch/x86/kvm/i8259.c
> +++ b/arch/x86/kvm/i8259.c
> @@ -106,7 +106,7 @@ static inline int pic_set_irq1(struct kvm_kpic_state
> *s, int irq, int level)
>  {
>      int mask, ret = 1;
>      mask = 1<<  irq;
> -    if (s->elcr&  mask)    /* level triggered */
> +    if (s->elcr&  mask)    /* level triggered */ {
>          if (level) {
>              ret = !(s->irr&  mask);
>              s->irr |= mask;
> @@ -115,7 +115,10 @@ static inline int pic_set_irq1(struct
> kvm_kpic_state *s, int irq, int level)
>              s->irr&= ~mask;
>              s->last_irr&= ~mask;
>          }
> -    else    /* edge triggered */
> +if (irq == 3)
> +    printk("%s %d: level=%d, irr = %x\n", __FUNCTION__,__LINE__,level,
> s->irr);
> +        }
> +    else    /* edge triggered */ {
>          if (level) {
>              if ((s->last_irr&  mask) == 0) {
>                  ret = !(s->irr&  mask);
> @@ -124,7 +127,9 @@ static inline int pic_set_irq1(struct kvm_kpic_state
> *s, int irq, int level)
>              s->last_irr |= mask;
>          } else
>              s->last_irr&= ~mask;
> -
> +if (irq == 3)
> +    printk("%s %d: level=%d, irr = %x\n", __FUNCTION__,__LINE__,level,
> s->irr);
> +        }
>      return (s->imr&  mask) ? -1 : ret;
>  }
> 
> @@ -206,6 +211,8 @@ int kvm_pic_set_irq(void *opaque, int irq, int level)
> 
>      pic_lock(s);
>      if (irq>= 0&&  irq<  PIC_NUM_PINS) {
> +if (irq == 3)
> +printk("%s\n", __FUNCTION__);
>          ret = pic_set_irq1(&s->pics[irq>>  3], irq&  7, level);
>          pic_update_irq(s);
>          trace_kvm_pic_set_irq(irq>>  3, irq&  7, s->pics[irq>>  3].elcr,
> 
> 
> 
> While it's still working I see this here with the levels changing 0-1-0.
> Though then it stops and levels are only at '1'.
> 
> [ 1773.833824] kvm_pic_set_irq
> [ 1773.833827] pic_set_irq1 131: level=0, irr = 5b
> [ 1773.834161] kvm_pic_set_irq
> [ 1773.834163] pic_set_irq1 131: level=1, irr = 5b
> [ 1773.834193] kvm_pic_set_irq
> [ 1773.834195] pic_set_irq1 131: level=0, irr = 5b
> [ 1773.835028] kvm_pic_set_irq
> [ 1773.835031] pic_set_irq1 131: level=1, irr = 5b
> [ 1773.835542] kvm_pic_set_irq
> [ 1773.835545] pic_set_irq1 131: level=1, irr = 5b
> [ 1773.889892] kvm_pic_set_irq
> [ 1773.889894] pic_set_irq1 131: level=1, irr = 5b
> [ 1791.258793] pic_set_irq1 119: level=1, irr = d9
> [ 1791.258824] pic_set_irq1 119: level=0, irr = d1
> [ 1791.402476] pic_set_irq1 119: level=1, irr = d9
> [ 1791.402534] pic_set_irq1 119: level=0, irr = d1
> [ 1791.402538] pic_set_irq1 119: level=1, irr = d9
> [...]
> 
> 
> I believe the last 5 shown calls can be ignored. After that the
> interrupts don't go through anymore.
> 
> In the device model I see interrupts being raised and cleared. After the
> last one was cleared in 'my' device model, only interrupts are raised.
> This looks as if the interrupt handler in the guest Linux was never
> run, thus the IRQ is never cleared and we're stuck.
> 

User space is responsible for both setting and clearing that line. IRQ3
means you are using some serial device model? Then you should check what
its state is.

Moreover, a complete picture of the kernel/user space interaction should
be obtainable by using ftrace for capturing kvm events.

Jan



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] Re: Errors on MMIO read access on VM suspend / resume operations
  2011-01-26  8:14                     ` Jan Kiszka
@ 2011-01-26 12:05                       ` Stefan Berger
  2011-01-26 12:09                         ` Jan Kiszka
  0 siblings, 1 reply; 14+ messages in thread
From: Stefan Berger @ 2011-01-26 12:05 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Avi Kivity, kvm, qemu-devel

On 01/26/2011 03:14 AM, Jan Kiszka wrote:
> On 2011-01-25 17:49, Stefan Berger wrote:
>> On 01/25/2011 02:26 AM, Jan Kiszka wrote:
>>> Do you see a chance to look closer at the issue yourself? E.g.
>>> instrument the kernel's irqchip models and dump their states once your
>>> guest is stuck?
>> The device runs on IRQ 3. So I applied this patch here.
>>
>> diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
>> index 3cece05..8f4f94c 100644
>> --- a/arch/x86/kvm/i8259.c
>> +++ b/arch/x86/kvm/i8259.c
>> @@ -106,7 +106,7 @@ static inline int pic_set_irq1(struct kvm_kpic_state
>> *s, int irq, int level)
>>   {
>>       int mask, ret = 1;
>>       mask = 1<<   irq;
>> -    if (s->elcr&   mask)    /* level triggered */
>> +    if (s->elcr&   mask)    /* level triggered */ {
>>           if (level) {
>>               ret = !(s->irr&   mask);
>>               s->irr |= mask;
>> @@ -115,7 +115,10 @@ static inline int pic_set_irq1(struct
>> kvm_kpic_state *s, int irq, int level)
>>               s->irr&= ~mask;
>>               s->last_irr&= ~mask;
>>           }
>> -    else    /* edge triggered */
>> +if (irq == 3)
>> +    printk("%s %d: level=%d, irr = %x\n", __FUNCTION__,__LINE__,level,
>> s->irr);
>> +        }
>> +    else    /* edge triggered */ {
>>           if (level) {
>>               if ((s->last_irr&   mask) == 0) {
>>                   ret = !(s->irr&   mask);
>> @@ -124,7 +127,9 @@ static inline int pic_set_irq1(struct kvm_kpic_state
>> *s, int irq, int level)
>>               s->last_irr |= mask;
>>           } else
>>               s->last_irr&= ~mask;
>> -
>> +if (irq == 3)
>> +    printk("%s %d: level=%d, irr = %x\n", __FUNCTION__,__LINE__,level,
>> s->irr);
>> +        }
>>       return (s->imr&   mask) ? -1 : ret;
>>   }
>>
>> @@ -206,6 +211,8 @@ int kvm_pic_set_irq(void *opaque, int irq, int level)
>>
>>       pic_lock(s);
>>       if (irq>= 0&&   irq<   PIC_NUM_PINS) {
>> +if (irq == 3)
>> +printk("%s\n", __FUNCTION__);
>>           ret = pic_set_irq1(&s->pics[irq>>   3], irq&   7, level);
>>           pic_update_irq(s);
>>           trace_kvm_pic_set_irq(irq>>   3, irq&   7, s->pics[irq>>   3].elcr,
>>
>>
>>
>> While it's still working I see this here with the levels changing 0-1-0.
>> Though then it stops and levels are only at '1'.
>>
>> [ 1773.833824] kvm_pic_set_irq
>> [ 1773.833827] pic_set_irq1 131: level=0, irr = 5b
>> [ 1773.834161] kvm_pic_set_irq
>> [ 1773.834163] pic_set_irq1 131: level=1, irr = 5b
>> [ 1773.834193] kvm_pic_set_irq
>> [ 1773.834195] pic_set_irq1 131: level=0, irr = 5b
>> [ 1773.835028] kvm_pic_set_irq
>> [ 1773.835031] pic_set_irq1 131: level=1, irr = 5b
>> [ 1773.835542] kvm_pic_set_irq
>> [ 1773.835545] pic_set_irq1 131: level=1, irr = 5b
>> [ 1773.889892] kvm_pic_set_irq
>> [ 1773.889894] pic_set_irq1 131: level=1, irr = 5b
>> [ 1791.258793] pic_set_irq1 119: level=1, irr = d9
>> [ 1791.258824] pic_set_irq1 119: level=0, irr = d1
>> [ 1791.402476] pic_set_irq1 119: level=1, irr = d9
>> [ 1791.402534] pic_set_irq1 119: level=0, irr = d1
>> [ 1791.402538] pic_set_irq1 119: level=1, irr = d9
>> [...]
>>
>>
>> I believe the last 5 shown calls can be ignored. After that the
>> interrupts don't go through anymore.
>>
>> In the device model I see interrupts being raised and cleared. After the
>> last one was cleared in 'my' device model, only interrupts are raised.
>> This looks as if the interrupt handler in the guest Linux was never
>> run, thus the IRQ is never cleared and we're stuck.
>>
> User space is responsible for both setting and clearing that line. IRQ3
> means you are using some serial device model? Then you should check what
> its state is.
Good hint. I moved it now to IRQ11 and it works fine now (with kvm-git) 
from what I can see. There was no UART on IRQ3 before, though, but 
certainly it was the wrong IRQ for it.
> Moreover, a complete picture of the kernel/user space interaction should
> be obtainable by using ftrace for capturing kvm events.
>
Should it be working on IRQ3? If so, I'd look into it when I get a chance...
    Stefan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] Re: Errors on MMIO read access on VM suspend / resume operations
  2011-01-26 12:05                       ` Stefan Berger
@ 2011-01-26 12:09                         ` Jan Kiszka
  2011-01-26 13:08                           ` Stefan Berger
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kiszka @ 2011-01-26 12:09 UTC (permalink / raw)
  To: Stefan Berger; +Cc: Avi Kivity, kvm, qemu-devel


On 2011-01-26 13:05, Stefan Berger wrote:
> On 01/26/2011 03:14 AM, Jan Kiszka wrote:
>> On 2011-01-25 17:49, Stefan Berger wrote:
>>> On 01/25/2011 02:26 AM, Jan Kiszka wrote:
>>>> Do you see a chance to look closer at the issue yourself? E.g.
>>>> instrument the kernel's irqchip models and dump their states once your
>>>> guest is stuck?
>>> The device runs on IRQ 3. So I applied this patch here.
>>>
>>> diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
>>> index 3cece05..8f4f94c 100644
>>> --- a/arch/x86/kvm/i8259.c
>>> +++ b/arch/x86/kvm/i8259.c
>>> @@ -106,7 +106,7 @@ static inline int pic_set_irq1(struct kvm_kpic_state *s, int irq, int level)
>>>   {
>>>       int mask, ret = 1;
>>>       mask = 1 << irq;
>>> -    if (s->elcr & mask)    /* level triggered */
>>> +    if (s->elcr & mask)    /* level triggered */ {
>>>           if (level) {
>>>               ret = !(s->irr & mask);
>>>               s->irr |= mask;
>>> @@ -115,7 +115,10 @@ static inline int pic_set_irq1(struct kvm_kpic_state *s, int irq, int level)
>>>               s->irr &= ~mask;
>>>               s->last_irr &= ~mask;
>>>           }
>>> -    else    /* edge triggered */
>>> +if (irq == 3)
>>> +    printk("%s %d: level=%d, irr = %x\n", __FUNCTION__,__LINE__,level, s->irr);
>>> +        }
>>> +    else    /* edge triggered */ {
>>>           if (level) {
>>>               if ((s->last_irr & mask) == 0) {
>>>                   ret = !(s->irr & mask);
>>> @@ -124,7 +127,9 @@ static inline int pic_set_irq1(struct kvm_kpic_state *s, int irq, int level)
>>>               s->last_irr |= mask;
>>>           } else
>>>               s->last_irr &= ~mask;
>>> -
>>> +if (irq == 3)
>>> +    printk("%s %d: level=%d, irr = %x\n", __FUNCTION__,__LINE__,level, s->irr);
>>> +        }
>>>       return (s->imr & mask) ? -1 : ret;
>>>   }
>>>
>>> @@ -206,6 +211,8 @@ int kvm_pic_set_irq(void *opaque, int irq, int level)
>>>
>>>       pic_lock(s);
>>>       if (irq >= 0 && irq < PIC_NUM_PINS) {
>>> +if (irq == 3)
>>> +printk("%s\n", __FUNCTION__);
>>>           ret = pic_set_irq1(&s->pics[irq >> 3], irq & 7, level);
>>>           pic_update_irq(s);
>>>           trace_kvm_pic_set_irq(irq >> 3, irq & 7, s->pics[irq >> 3].elcr,
>>>
>>>
>>>
>>> While it's still working I see this here with the levels changing 0-1-0.
>>> Though then it stops and levels are only at '1'.
>>>
>>> [ 1773.833824] kvm_pic_set_irq
>>> [ 1773.833827] pic_set_irq1 131: level=0, irr = 5b
>>> [ 1773.834161] kvm_pic_set_irq
>>> [ 1773.834163] pic_set_irq1 131: level=1, irr = 5b
>>> [ 1773.834193] kvm_pic_set_irq
>>> [ 1773.834195] pic_set_irq1 131: level=0, irr = 5b
>>> [ 1773.835028] kvm_pic_set_irq
>>> [ 1773.835031] pic_set_irq1 131: level=1, irr = 5b
>>> [ 1773.835542] kvm_pic_set_irq
>>> [ 1773.835545] pic_set_irq1 131: level=1, irr = 5b
>>> [ 1773.889892] kvm_pic_set_irq
>>> [ 1773.889894] pic_set_irq1 131: level=1, irr = 5b
>>> [ 1791.258793] pic_set_irq1 119: level=1, irr = d9
>>> [ 1791.258824] pic_set_irq1 119: level=0, irr = d1
>>> [ 1791.402476] pic_set_irq1 119: level=1, irr = d9
>>> [ 1791.402534] pic_set_irq1 119: level=0, irr = d1
>>> [ 1791.402538] pic_set_irq1 119: level=1, irr = d9
>>> [...]
>>>
>>>
>>> I believe the last 5 shown calls can be ignored. After that the
>>> interrupts don't go through anymore.
>>>
>>> In the device model I see interrupts being raised and cleared. After the
>>> last one was cleared in 'my' device model, only interrupts are raised.
>>> This looks like as if the interrupt handler in the guest Linux was never
>>> run, thus the IRQ is never cleared and we're stuck.
>>>
>> User space is responsible for both setting and clearing that line. IRQ3
>> means you are using some serial device model? Then you should check what
>> its state is.
> Good hint. I moved it now to IRQ11 and it works fine now (with kvm-git)
> from what I can see. There was no UART on IRQ3 before, though, but
> certainly it was the wrong IRQ for it.
>> Moreover, a complete picture of the kernel/user space interaction should
>> be obtainable by using ftrace for capturing kvm events.
>>
> Should it be working on IRQ3? If so, I'd look into it when I get a
> chance...

I don't know your customizations, so it's hard to tell if that should
work or not. IRQ3 is intended to be used by ISA devices on the PC
machine. Are you adding an ISA model, or what is your use case?

Jan


* Re: [Qemu-devel] Re: Errors on MMIO read access on VM suspend / resume operations
  2011-01-26 12:09                         ` Jan Kiszka
@ 2011-01-26 13:08                           ` Stefan Berger
  2011-01-26 13:15                             ` Jan Kiszka
  0 siblings, 1 reply; 14+ messages in thread
From: Stefan Berger @ 2011-01-26 13:08 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Avi Kivity, kvm, qemu-devel

On 01/26/2011 07:09 AM, Jan Kiszka wrote:
> On 2011-01-26 13:05, Stefan Berger wrote:
>> On 01/26/2011 03:14 AM, Jan Kiszka wrote:
>>> On 2011-01-25 17:49, Stefan Berger wrote:
>>>> On 01/25/2011 02:26 AM, Jan Kiszka wrote:
>>>>> Do you see a chance to look closer at the issue yourself? E.g.
>>>>> instrument the kernel's irqchip models and dump their states once your
>>>>> guest is stuck?
>>>> The device runs on iRQ 3. So I applied this patch here.
>>>>
>>>> diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
>>>> index 3cece05..8f4f94c 100644
>>>> --- a/arch/x86/kvm/i8259.c
>>>> +++ b/arch/x86/kvm/i8259.c
>>>> @@ -106,7 +106,7 @@ static inline int pic_set_irq1(struct kvm_kpic_state
>>>> *s, int irq, int level)
>>>>    {
>>>>        int mask, ret = 1;
>>>>        mask = 1<<    irq;
>>>> -    if (s->elcr&    mask)    /* level triggered */
>>>> +    if (s->elcr&    mask)    /* level triggered */ {
>>>>            if (level) {
>>>>                ret = !(s->irr&    mask);
>>>>                s->irr |= mask;
>>>> @@ -115,7 +115,10 @@ static inline int pic_set_irq1(struct
>>>> kvm_kpic_state *s, int irq, int level)
>>>>                s->irr&= ~mask;
>>>>                s->last_irr&= ~mask;
>>>>            }
>>>> -    else    /* edge triggered */
>>>> +if (irq == 3)
>>>> +    printk("%s %d: level=%d, irr = %x\n", __FUNCTION__,__LINE__,level,
>>>> s->irr);
>>>> +        }
>>>> +    else    /* edge triggered */ {
>>>>            if (level) {
>>>>                if ((s->last_irr&    mask) == 0) {
>>>>                    ret = !(s->irr&    mask);
>>>> @@ -124,7 +127,9 @@ static inline int pic_set_irq1(struct kvm_kpic_state
>>>> *s, int irq, int level)
>>>>                s->last_irr |= mask;
>>>>            } else
>>>>                s->last_irr&= ~mask;
>>>> -
>>>> +if (irq == 3)
>>>> +    printk("%s %d: level=%d, irr = %x\n", __FUNCTION__,__LINE__,level,
>>>> s->irr);
>>>> +        }
>>>>        return (s->imr&    mask) ? -1 : ret;
>>>>    }
>>>>
>>>> @@ -206,6 +211,8 @@ int kvm_pic_set_irq(void *opaque, int irq, int
>>>> level)
>>>>
>>>>        pic_lock(s);
>>>>        if (irq>= 0&&    irq<    PIC_NUM_PINS) {
>>>> +if (irq == 3)
>>>> +printk("%s\n", __FUNCTION__);
>>>>            ret = pic_set_irq1(&s->pics[irq>>    3], irq&    7, level);
>>>>            pic_update_irq(s);
>>>>            trace_kvm_pic_set_irq(irq>>    3, irq&    7, s->pics[irq>>
>>>> 3].elcr,
>>>>
>>>>
>>>>
>>>> While it's still working I see this here with the levels changing 0-1-0.
>>>> Though then it stops and levels are only at '1'.
>>>>
>>>> [ 1773.833824] kvm_pic_set_irq
>>>> [ 1773.833827] pic_set_irq1 131: level=0, irr = 5b
>>>> [ 1773.834161] kvm_pic_set_irq
>>>> [ 1773.834163] pic_set_irq1 131: level=1, irr = 5b
>>>> [ 1773.834193] kvm_pic_set_irq
>>>> [ 1773.834195] pic_set_irq1 131: level=0, irr = 5b
>>>> [ 1773.835028] kvm_pic_set_irq
>>>> [ 1773.835031] pic_set_irq1 131: level=1, irr = 5b
>>>> [ 1773.835542] kvm_pic_set_irq
>>>> [ 1773.835545] pic_set_irq1 131: level=1, irr = 5b
>>>> [ 1773.889892] kvm_pic_set_irq
>>>> [ 1773.889894] pic_set_irq1 131: level=1, irr = 5b
>>>> [ 1791.258793] pic_set_irq1 119: level=1, irr = d9
>>>> [ 1791.258824] pic_set_irq1 119: level=0, irr = d1
>>>> [ 1791.402476] pic_set_irq1 119: level=1, irr = d9
>>>> [ 1791.402534] pic_set_irq1 119: level=0, irr = d1
>>>> [ 1791.402538] pic_set_irq1 119: level=1, irr = d9
>>>> [...]
>>>>
>>>>
>>>> I believe the last 5 shown calls can be ignored. After that the
>>>> interrupts don't go through anymore.
>>>>
>>>> In the device model I see interrupts being raised and cleared. After the
>>>> last one was cleared in 'my' device model, only interrupts are raised.
>>>> This looks like as if the interrupt handler in the guest Linux was never
>>>> run, thus the IRQ is never cleared and we're stuck.
>>>>
>>> User space is responsible for both setting and clearing that line. IRQ3
>>> means you are using some serial device model? Then you should check what
>>> its state is.
>> Good hint. I moved it now to IRQ11 and it works fine now (with kvm-git)
>> from what I can see. There was no UART on IRQ3 before, though, but
>> certainly it was the wrong IRQ for it.
>>> Moreover, a complete picture of the kernel/user space interaction should
>>> be obtainable by using ftrace for capturing kvm events.
>>>
>> Should it be working on IRQ3? If so, I'd look into it when I get a
>> chance...
> I don't know your customizations, so it's hard to tell if that should
> work or not. IRQ3 is intended to be used by ISA devices on the PC
> machine. Are you adding an ISA model, or what is your use case?
>
The use case is to add a TPM device interface.

http://xenbits.xensource.com/xen-unstable.hg?file/1e56ac73b9b9/tools/ioemu/hw/tpm_tis.c

This one is typically connected to the LPC bus.

    Stefan

> Jan
>

* Re: [Qemu-devel] Re: Errors on MMIO read access on VM suspend / resume operations
  2011-01-26 13:08                           ` Stefan Berger
@ 2011-01-26 13:15                             ` Jan Kiszka
  2011-01-26 13:31                               ` Jan Kiszka
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kiszka @ 2011-01-26 13:15 UTC (permalink / raw)
  To: Stefan Berger; +Cc: Avi Kivity, kvm, qemu-devel

On 2011-01-26 14:08, Stefan Berger wrote:
> On 01/26/2011 07:09 AM, Jan Kiszka wrote:
>> On 2011-01-26 13:05, Stefan Berger wrote:
>>> On 01/26/2011 03:14 AM, Jan Kiszka wrote:
>>>> On 2011-01-25 17:49, Stefan Berger wrote:
>>>>> On 01/25/2011 02:26 AM, Jan Kiszka wrote:
>>>>>> Do you see a chance to look closer at the issue yourself? E.g.
>>>>>> instrument the kernel's irqchip models and dump their states once
>>>>>> your
>>>>>> guest is stuck?
>>>>> The device runs on iRQ 3. So I applied this patch here.
>>>>>
>>>>> diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
>>>>> index 3cece05..8f4f94c 100644
>>>>> --- a/arch/x86/kvm/i8259.c
>>>>> +++ b/arch/x86/kvm/i8259.c
>>>>> @@ -106,7 +106,7 @@ static inline int pic_set_irq1(struct
>>>>> kvm_kpic_state
>>>>> *s, int irq, int level)
>>>>>    {
>>>>>        int mask, ret = 1;
>>>>>        mask = 1<<    irq;
>>>>> -    if (s->elcr&    mask)    /* level triggered */
>>>>> +    if (s->elcr&    mask)    /* level triggered */ {
>>>>>            if (level) {
>>>>>                ret = !(s->irr&    mask);
>>>>>                s->irr |= mask;
>>>>> @@ -115,7 +115,10 @@ static inline int pic_set_irq1(struct
>>>>> kvm_kpic_state *s, int irq, int level)
>>>>>                s->irr&= ~mask;
>>>>>                s->last_irr&= ~mask;
>>>>>            }
>>>>> -    else    /* edge triggered */
>>>>> +if (irq == 3)
>>>>> +    printk("%s %d: level=%d, irr = %x\n",
>>>>> __FUNCTION__,__LINE__,level,
>>>>> s->irr);
>>>>> +        }
>>>>> +    else    /* edge triggered */ {
>>>>>            if (level) {
>>>>>                if ((s->last_irr&    mask) == 0) {
>>>>>                    ret = !(s->irr&    mask);
>>>>> @@ -124,7 +127,9 @@ static inline int pic_set_irq1(struct
>>>>> kvm_kpic_state
>>>>> *s, int irq, int level)
>>>>>                s->last_irr |= mask;
>>>>>            } else
>>>>>                s->last_irr&= ~mask;
>>>>> -
>>>>> +if (irq == 3)
>>>>> +    printk("%s %d: level=%d, irr = %x\n",
>>>>> __FUNCTION__,__LINE__,level,
>>>>> s->irr);
>>>>> +        }
>>>>>        return (s->imr&    mask) ? -1 : ret;
>>>>>    }
>>>>>
>>>>> @@ -206,6 +211,8 @@ int kvm_pic_set_irq(void *opaque, int irq, int
>>>>> level)
>>>>>
>>>>>        pic_lock(s);
>>>>>        if (irq>= 0&&    irq<    PIC_NUM_PINS) {
>>>>> +if (irq == 3)
>>>>> +printk("%s\n", __FUNCTION__);
>>>>>            ret = pic_set_irq1(&s->pics[irq>>    3], irq&    7, level);
>>>>>            pic_update_irq(s);
>>>>>            trace_kvm_pic_set_irq(irq>>    3, irq&    7, s->pics[irq>>
>>>>> 3].elcr,
>>>>>
>>>>>
>>>>>
>>>>> While it's still working I see this here with the levels changing
>>>>> 0-1-0.
>>>>> Though then it stops and levels are only at '1'.
>>>>>
>>>>> [ 1773.833824] kvm_pic_set_irq
>>>>> [ 1773.833827] pic_set_irq1 131: level=0, irr = 5b
>>>>> [ 1773.834161] kvm_pic_set_irq
>>>>> [ 1773.834163] pic_set_irq1 131: level=1, irr = 5b
>>>>> [ 1773.834193] kvm_pic_set_irq
>>>>> [ 1773.834195] pic_set_irq1 131: level=0, irr = 5b
>>>>> [ 1773.835028] kvm_pic_set_irq
>>>>> [ 1773.835031] pic_set_irq1 131: level=1, irr = 5b
>>>>> [ 1773.835542] kvm_pic_set_irq
>>>>> [ 1773.835545] pic_set_irq1 131: level=1, irr = 5b
>>>>> [ 1773.889892] kvm_pic_set_irq
>>>>> [ 1773.889894] pic_set_irq1 131: level=1, irr = 5b
>>>>> [ 1791.258793] pic_set_irq1 119: level=1, irr = d9
>>>>> [ 1791.258824] pic_set_irq1 119: level=0, irr = d1
>>>>> [ 1791.402476] pic_set_irq1 119: level=1, irr = d9
>>>>> [ 1791.402534] pic_set_irq1 119: level=0, irr = d1
>>>>> [ 1791.402538] pic_set_irq1 119: level=1, irr = d9
>>>>> [...]
>>>>>
>>>>>
>>>>> I believe the last 5 shown calls can be ignored. After that the
>>>>> interrupts don't go through anymore.
>>>>>
>>>>> In the device model I see interrupts being raised and cleared.
>>>>> After the
>>>>> last one was cleared in 'my' device model, only interrupts are raised.
>>>>> This looks like as if the interrupt handler in the guest Linux was
>>>>> never
>>>>> run, thus the IRQ is never cleared and we're stuck.
>>>>>
>>>> User space is responsible for both setting and clearing that line. IRQ3
>>>> means you are using some serial device model? Then you should check
>>>> what
>>>> its state is.
>>> Good hint. I moved it now to IRQ11 and it works fine now (with kvm-git)
>>> from what I can see. There was no UART on IRQ3 before, though, but
>>> certainly it was the wrong IRQ for it.
>>>> Moreover, a complete picture of the kernel/user space interaction
>>>> should
>>>> be obtainable by using ftrace for capturing kvm events.
>>>>
>>> Should it be working on IRQ3? If so, I'd look into it when I get a
>>> chance...
>> I don't know your customizations, so it's hard to tell if that should
>> work or not. IRQ3 is intended to be used by ISA devices on the PC
>> machine. Are you adding an ISA model, or what is your use case?
>>
> The use case is to add a TPM device interface.
> 
> http://xenbits.xensource.com/xen-unstable.hg?file/1e56ac73b9b9/tools/ioemu/hw/tpm_tis.c
> 
> 
> This one typically is connected to the LPC bus.

I see. Do you also have the xen-free version of it? Maybe there are
still issues with proper qdev integration etc.

Jan


* Re: [Qemu-devel] Re: Errors on MMIO read access on VM suspend / resume operations
  2011-01-26 13:15                             ` Jan Kiszka
@ 2011-01-26 13:31                               ` Jan Kiszka
  2011-01-26 13:52                                 ` Stefan Berger
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kiszka @ 2011-01-26 13:31 UTC (permalink / raw)
  To: Stefan Berger; +Cc: Avi Kivity, kvm, qemu-devel

On 2011-01-26 14:15, Jan Kiszka wrote:
> On 2011-01-26 14:08, Stefan Berger wrote:
>> On 01/26/2011 07:09 AM, Jan Kiszka wrote:
>>> On 2011-01-26 13:05, Stefan Berger wrote:
>>>> On 01/26/2011 03:14 AM, Jan Kiszka wrote:
>>>>> On 2011-01-25 17:49, Stefan Berger wrote:
>>>>>> On 01/25/2011 02:26 AM, Jan Kiszka wrote:
>>>>>>> Do you see a chance to look closer at the issue yourself? E.g.
>>>>>>> instrument the kernel's irqchip models and dump their states once
>>>>>>> your
>>>>>>> guest is stuck?
>>>>>> The device runs on iRQ 3. So I applied this patch here.
>>>>>>
>>>>>> diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
>>>>>> index 3cece05..8f4f94c 100644
>>>>>> --- a/arch/x86/kvm/i8259.c
>>>>>> +++ b/arch/x86/kvm/i8259.c
>>>>>> @@ -106,7 +106,7 @@ static inline int pic_set_irq1(struct
>>>>>> kvm_kpic_state
>>>>>> *s, int irq, int level)
>>>>>>    {
>>>>>>        int mask, ret = 1;
>>>>>>        mask = 1<<    irq;
>>>>>> -    if (s->elcr&    mask)    /* level triggered */
>>>>>> +    if (s->elcr&    mask)    /* level triggered */ {
>>>>>>            if (level) {
>>>>>>                ret = !(s->irr&    mask);
>>>>>>                s->irr |= mask;
>>>>>> @@ -115,7 +115,10 @@ static inline int pic_set_irq1(struct
>>>>>> kvm_kpic_state *s, int irq, int level)
>>>>>>                s->irr&= ~mask;
>>>>>>                s->last_irr&= ~mask;
>>>>>>            }
>>>>>> -    else    /* edge triggered */
>>>>>> +if (irq == 3)
>>>>>> +    printk("%s %d: level=%d, irr = %x\n",
>>>>>> __FUNCTION__,__LINE__,level,
>>>>>> s->irr);
>>>>>> +        }
>>>>>> +    else    /* edge triggered */ {
>>>>>>            if (level) {
>>>>>>                if ((s->last_irr&    mask) == 0) {
>>>>>>                    ret = !(s->irr&    mask);
>>>>>> @@ -124,7 +127,9 @@ static inline int pic_set_irq1(struct
>>>>>> kvm_kpic_state
>>>>>> *s, int irq, int level)
>>>>>>                s->last_irr |= mask;
>>>>>>            } else
>>>>>>                s->last_irr&= ~mask;
>>>>>> -
>>>>>> +if (irq == 3)
>>>>>> +    printk("%s %d: level=%d, irr = %x\n",
>>>>>> __FUNCTION__,__LINE__,level,
>>>>>> s->irr);
>>>>>> +        }
>>>>>>        return (s->imr&    mask) ? -1 : ret;
>>>>>>    }
>>>>>>
>>>>>> @@ -206,6 +211,8 @@ int kvm_pic_set_irq(void *opaque, int irq, int
>>>>>> level)
>>>>>>
>>>>>>        pic_lock(s);
>>>>>>        if (irq>= 0&&    irq<    PIC_NUM_PINS) {
>>>>>> +if (irq == 3)
>>>>>> +printk("%s\n", __FUNCTION__);
>>>>>>            ret = pic_set_irq1(&s->pics[irq>>    3], irq&    7, level);
>>>>>>            pic_update_irq(s);
>>>>>>            trace_kvm_pic_set_irq(irq>>    3, irq&    7, s->pics[irq>>
>>>>>> 3].elcr,
>>>>>>
>>>>>>
>>>>>>
>>>>>> While it's still working I see this here with the levels changing
>>>>>> 0-1-0.
>>>>>> Though then it stops and levels are only at '1'.
>>>>>>
>>>>>> [ 1773.833824] kvm_pic_set_irq
>>>>>> [ 1773.833827] pic_set_irq1 131: level=0, irr = 5b
>>>>>> [ 1773.834161] kvm_pic_set_irq
>>>>>> [ 1773.834163] pic_set_irq1 131: level=1, irr = 5b
>>>>>> [ 1773.834193] kvm_pic_set_irq
>>>>>> [ 1773.834195] pic_set_irq1 131: level=0, irr = 5b
>>>>>> [ 1773.835028] kvm_pic_set_irq
>>>>>> [ 1773.835031] pic_set_irq1 131: level=1, irr = 5b
>>>>>> [ 1773.835542] kvm_pic_set_irq
>>>>>> [ 1773.835545] pic_set_irq1 131: level=1, irr = 5b
>>>>>> [ 1773.889892] kvm_pic_set_irq
>>>>>> [ 1773.889894] pic_set_irq1 131: level=1, irr = 5b
>>>>>> [ 1791.258793] pic_set_irq1 119: level=1, irr = d9
>>>>>> [ 1791.258824] pic_set_irq1 119: level=0, irr = d1
>>>>>> [ 1791.402476] pic_set_irq1 119: level=1, irr = d9
>>>>>> [ 1791.402534] pic_set_irq1 119: level=0, irr = d1
>>>>>> [ 1791.402538] pic_set_irq1 119: level=1, irr = d9
>>>>>> [...]
>>>>>>
>>>>>>
>>>>>> I believe the last 5 shown calls can be ignored. After that the
>>>>>> interrupts don't go through anymore.
>>>>>>
>>>>>> In the device model I see interrupts being raised and cleared.
>>>>>> After the
>>>>>> last one was cleared in 'my' device model, only interrupts are raised.
>>>>>> This looks like as if the interrupt handler in the guest Linux was
>>>>>> never
>>>>>> run, thus the IRQ is never cleared and we're stuck.
>>>>>>
>>>>> User space is responsible for both setting and clearing that line. IRQ3
>>>>> means you are using some serial device model? Then you should check
>>>>> what
>>>>> its state is.
>>>> Good hint. I moved it now to IRQ11 and it works fine now (with kvm-git)
>>>> from what I can see. There was no UART on IRQ3 before, though, but
>>>> certainly it was the wrong IRQ for it.
>>>>> Moreover, a complete picture of the kernel/user space interaction
>>>>> should
>>>>> be obtainable by using ftrace for capturing kvm events.
>>>>>
>>>> Should it be working on IRQ3? If so, I'd look into it when I get a
>>>> chance...
>>> I don't know your customizations, so it's hard to tell if that should
>>> work or not. IRQ3 is intended to be used by ISA devices on the PC
>>> machine. Are you adding an ISA model, or what is your use case?
>>>
>> The use case is to add a TPM device interface.
>>
>> http://xenbits.xensource.com/xen-unstable.hg?file/1e56ac73b9b9/tools/ioemu/hw/tpm_tis.c
>>
>>
>> This one typically is connected to the LPC bus.
> 
> I see. Do you also have the xen-free version of it? Maybe there are
> still issues with proper qdev integration etc.
> 

Without knowing the hardware spec or what is actually behind set_irq,
this looks at least suspicious:

[...]
if (off == TPM_REG_INT_STATUS) {
    /* clearing of interrupt flags */
    if ((val & INTERRUPTS_SUPPORTED) &&
        (s->loc[locty].ints & INTERRUPTS_SUPPORTED)) {
        s->set_irq(s->irq_opaque, s->irq, 0);
        s->irq_pending = 0;
    }
    s->loc[locty].ints &= ~(val & INTERRUPTS_SUPPORTED);
} else
[...]

The code does not check if there are ints left after masking out those
provided in val. Does that device already de-assert the line if you only
clear a single interrupt reason?
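
To make the concern concrete, here is a minimal, self-contained sketch of
the alternative ordering: clear the acknowledged reasons first, then lower
the line only if nothing is left pending. Everything in it (struct
tis_model, the model_* helpers, the mask value, the number of localities)
is a hypothetical stand-in and not the actual tpm_tis.c code. It also
assumes the device is meant to keep the line asserted while any reason
remains pending, which is exactly the open question.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; names and values are illustrative only. */
#define INTERRUPTS_SUPPORTED 0x0fu  /* assumed mask of valid reasons */
#define NUM_LOCALITIES       5      /* assumed number of localities */

struct tis_model {
    uint32_t ints[NUM_LOCALITIES];  /* pending reasons per locality */
    int line_level;                 /* last level driven on the IRQ line */
};

/* Stand-in for s->set_irq(): only records the requested level. */
static void model_set_irq(struct tis_model *s, int level)
{
    s->line_level = level;
}

/* Device side: latch one or more reasons and assert the line. */
static void model_raise(struct tis_model *s, unsigned locty, uint32_t reasons)
{
    assert(locty < NUM_LOCALITIES);
    s->ints[locty] |= reasons & INTERRUPTS_SUPPORTED;
    if (s->ints[locty])
        model_set_irq(s, 1);
}

/*
 * Guest write to the interrupt status register: drop the acknowledged
 * reasons, then de-assert only when no reason is left for this locality.
 */
static void model_int_status_write(struct tis_model *s, unsigned locty,
                                   uint32_t val)
{
    assert(locty < NUM_LOCALITIES);
    s->ints[locty] &= ~(val & INTERRUPTS_SUPPORTED);
    if ((s->ints[locty] & INTERRUPTS_SUPPORTED) == 0)
        model_set_irq(s, 0);
}
```

With two reasons pending, acknowledging only one of them leaves the line
high in this model, whereas the quoted code would already have lowered
it. The sketch only looks at one locality; whether other localities need
to be considered as well depends on the spec.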

BTW, irq_pending looks redundant, at least when using the qemu irq
subsystem.

Jan


* Re: [Qemu-devel] Re: Errors on MMIO read access on VM suspend / resume operations
  2011-01-26 13:31                               ` Jan Kiszka
@ 2011-01-26 13:52                                 ` Stefan Berger
  0 siblings, 0 replies; 14+ messages in thread
From: Stefan Berger @ 2011-01-26 13:52 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Avi Kivity, kvm, qemu-devel

On 01/26/2011 08:31 AM, Jan Kiszka wrote:
> On 2011-01-26 14:15, Jan Kiszka wrote:
>> On 2011-01-26 14:08, Stefan Berger wrote:
>>> On 01/26/2011 07:09 AM, Jan Kiszka wrote:
>>>> On 2011-01-26 13:05, Stefan Berger wrote:
>>>>> On 01/26/2011 03:14 AM, Jan Kiszka wrote:
>>>>>> On 2011-01-25 17:49, Stefan Berger wrote:
>>>>>>> On 01/25/2011 02:26 AM, Jan Kiszka wrote:
>>>>>>>> Do you see a chance to look closer at the issue yourself? E.g.
>>>>>>>> instrument the kernel's irqchip models and dump their states once
>>>>>>>> your
>>>>>>>> guest is stuck?
>>>>>>> The device runs on iRQ 3. So I applied this patch here.
>>>>>>>
>>>>>>> diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
>>>>>>> index 3cece05..8f4f94c 100644
>>>>>>> --- a/arch/x86/kvm/i8259.c
>>>>>>> +++ b/arch/x86/kvm/i8259.c
>>>>>>> @@ -106,7 +106,7 @@ static inline int pic_set_irq1(struct
>>>>>>> kvm_kpic_state
>>>>>>> *s, int irq, int level)
>>>>>>>     {
>>>>>>>         int mask, ret = 1;
>>>>>>>         mask = 1<<     irq;
>>>>>>> -    if (s->elcr&     mask)    /* level triggered */
>>>>>>> +    if (s->elcr&     mask)    /* level triggered */ {
>>>>>>>             if (level) {
>>>>>>>                 ret = !(s->irr&     mask);
>>>>>>>                 s->irr |= mask;
>>>>>>> @@ -115,7 +115,10 @@ static inline int pic_set_irq1(struct
>>>>>>> kvm_kpic_state *s, int irq, int level)
>>>>>>>                 s->irr&= ~mask;
>>>>>>>                 s->last_irr&= ~mask;
>>>>>>>             }
>>>>>>> -    else    /* edge triggered */
>>>>>>> +if (irq == 3)
>>>>>>> +    printk("%s %d: level=%d, irr = %x\n",
>>>>>>> __FUNCTION__,__LINE__,level,
>>>>>>> s->irr);
>>>>>>> +        }
>>>>>>> +    else    /* edge triggered */ {
>>>>>>>             if (level) {
>>>>>>>                 if ((s->last_irr&     mask) == 0) {
>>>>>>>                     ret = !(s->irr&     mask);
>>>>>>> @@ -124,7 +127,9 @@ static inline int pic_set_irq1(struct
>>>>>>> kvm_kpic_state
>>>>>>> *s, int irq, int level)
>>>>>>>                 s->last_irr |= mask;
>>>>>>>             } else
>>>>>>>                 s->last_irr&= ~mask;
>>>>>>> -
>>>>>>> +if (irq == 3)
>>>>>>> +    printk("%s %d: level=%d, irr = %x\n",
>>>>>>> __FUNCTION__,__LINE__,level,
>>>>>>> s->irr);
>>>>>>> +        }
>>>>>>>         return (s->imr&     mask) ? -1 : ret;
>>>>>>>     }
>>>>>>>
>>>>>>> @@ -206,6 +211,8 @@ int kvm_pic_set_irq(void *opaque, int irq, int
>>>>>>> level)
>>>>>>>
>>>>>>>         pic_lock(s);
>>>>>>>         if (irq>= 0&&     irq<     PIC_NUM_PINS) {
>>>>>>> +if (irq == 3)
>>>>>>> +printk("%s\n", __FUNCTION__);
>>>>>>>             ret = pic_set_irq1(&s->pics[irq>>     3], irq&     7, level);
>>>>>>>             pic_update_irq(s);
>>>>>>>             trace_kvm_pic_set_irq(irq>>     3, irq&     7, s->pics[irq>>
>>>>>>> 3].elcr,
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> While it's still working I see this here with the levels changing
>>>>>>> 0-1-0.
>>>>>>> Though then it stops and levels are only at '1'.
>>>>>>>
>>>>>>> [ 1773.833824] kvm_pic_set_irq
>>>>>>> [ 1773.833827] pic_set_irq1 131: level=0, irr = 5b
>>>>>>> [ 1773.834161] kvm_pic_set_irq
>>>>>>> [ 1773.834163] pic_set_irq1 131: level=1, irr = 5b
>>>>>>> [ 1773.834193] kvm_pic_set_irq
>>>>>>> [ 1773.834195] pic_set_irq1 131: level=0, irr = 5b
>>>>>>> [ 1773.835028] kvm_pic_set_irq
>>>>>>> [ 1773.835031] pic_set_irq1 131: level=1, irr = 5b
>>>>>>> [ 1773.835542] kvm_pic_set_irq
>>>>>>> [ 1773.835545] pic_set_irq1 131: level=1, irr = 5b
>>>>>>> [ 1773.889892] kvm_pic_set_irq
>>>>>>> [ 1773.889894] pic_set_irq1 131: level=1, irr = 5b
>>>>>>> [ 1791.258793] pic_set_irq1 119: level=1, irr = d9
>>>>>>> [ 1791.258824] pic_set_irq1 119: level=0, irr = d1
>>>>>>> [ 1791.402476] pic_set_irq1 119: level=1, irr = d9
>>>>>>> [ 1791.402534] pic_set_irq1 119: level=0, irr = d1
>>>>>>> [ 1791.402538] pic_set_irq1 119: level=1, irr = d9
>>>>>>> [...]
>>>>>>>
>>>>>>>
>>>>>>> I believe the last 5 shown calls can be ignored. After that the
>>>>>>> interrupts don't go through anymore.
>>>>>>>
>>>>>>> In the device model I see interrupts being raised and cleared.
>>>>>>> After the
>>>>>>> last one was cleared in 'my' device model, only interrupts are raised.
>>>>>>> This looks like as if the interrupt handler in the guest Linux was
>>>>>>> never
>>>>>>> run, thus the IRQ is never cleared and we're stuck.
>>>>>>>
>>>>>> User space is responsible for both setting and clearing that line. IRQ3
>>>>>> means you are using some serial device model? Then you should check
>>>>>> what
>>>>>> its state is.
>>>>> Good hint. I moved it now to IRQ11 and it works fine now (with kvm-git)
>>>>> from what I can see. There was no UART on IRQ3 before, though, but
>>>>> certainly it was the wrong IRQ for it.
>>>>>> Moreover, a complete picture of the kernel/user space interaction
>>>>>> should
>>>>>> be obtainable by using ftrace for capturing kvm events.
>>>>>>
>>>>> Should it be working on IRQ3? If so, I'd look into it when I get a
>>>>> chance...
>>>> I don't know your customizations, so it's hard to tell if that should
>>>> work or not. IRQ3 is intended to be used by ISA devices on the PC
>>>> machine. Are you adding an ISA model, or what is your use case?
>>>>
>>> The use case is to add a TPM device interface.
>>>
>>> http://xenbits.xensource.com/xen-unstable.hg?file/1e56ac73b9b9/tools/ioemu/hw/tpm_tis.c
>>>
>>>
>>> This one typically is connected to the LPC bus.
>> I see. Do you also have the xen-free version of it? Maybe there are
>> still issues with proper qdev integration etc.
>>
> Without knowing the hardware spec or what is actually behind set_irq,
> this looks at least suspicious:
>
> [...]
> if (off == TPM_REG_INT_STATUS) {
>      /* clearing of interrupt flags */
>      if ((val & INTERRUPTS_SUPPORTED) &&
>          (s->loc[locty].ints & INTERRUPTS_SUPPORTED)) {
>          s->set_irq(s->irq_opaque, s->irq, 0);
>          s->irq_pending = 0;
>      }
>      s->loc[locty].ints &= ~(val & INTERRUPTS_SUPPORTED);
> } else
> [...]
>
> The code does not check if there are ints left after masking out those
> provided in val. Does that device already de-assert the line if you only
> clear a single interrupt reason?
>
> BTW, irq_pending looks redundant, at least when using the qemu irq
> subsystem.
The code has changed substantially in the meantime. The Xen repository
code is more than 3 years old; I had to go back through the xen-unstable
repository history to find it. The link was merely meant to show what
type of device is being added. As I said, some other things need to come
together first before this becomes available.

    Stefan

end of thread, other threads:[~2011-01-26 13:54 UTC | newest]

Thread overview: 14+ messages
     [not found] <4D2C8305.2090609@linux.vnet.ibm.com>
     [not found] ` <4D2ED260.4010801@redhat.com>
     [not found]   ` <4D30A38F.3030002@linux.vnet.ibm.com>
     [not found]     ` <4D3303FD.8020509@redhat.com>
2011-01-18  3:03       ` [Qemu-devel] Re: Errors on MMIO read access on VM suspend / resume operations Stefan Berger
2011-01-18  8:53         ` Jan Kiszka
2011-01-24 18:27           ` Stefan Berger
2011-01-24 22:34             ` Jan Kiszka
2011-01-25  3:13               ` Stefan Berger
2011-01-25  7:26                 ` Jan Kiszka
2011-01-25 16:49                   ` Stefan Berger
2011-01-26  8:14                     ` Jan Kiszka
2011-01-26 12:05                       ` Stefan Berger
2011-01-26 12:09                         ` Jan Kiszka
2011-01-26 13:08                           ` Stefan Berger
2011-01-26 13:15                             ` Jan Kiszka
2011-01-26 13:31                               ` Jan Kiszka
2011-01-26 13:52                                 ` Stefan Berger
