public inbox for kvm@vger.kernel.org
* Problems on AMD laptops
From: Avi Kivity @ 2009-06-29 13:41 UTC
  To: KVM list

kerneloops.org shows tons of oopses on amd, see 
http://www.kerneloops.org/oops.php?number=79008.  I suspect this has to 
do with resuming a laptop while a guest is running.  Can anyone confirm 
or deny?

-- 
error compiling committee.c: too many arguments to function



* Re: Problems on AMD laptops
From: Joerg Roedel @ 2009-06-29 14:39 UTC
  To: Avi Kivity; +Cc: KVM list

On Mon, Jun 29, 2009 at 04:41:17PM +0300, Avi Kivity wrote:
> kerneloops.org shows tons of oopses on amd, see  
> http://www.kerneloops.org/oops.php?number=79008.  I suspect this has to  
> do with resuming a laptop while a guest is running.  Can anyone confirm  
> or deny?

I haven't verified this yet, but it may have to do with dirty caches that
are not written back to memory in suspend-to-ram and are thus lost. The
resume code-path looks otherwise sane to me. The only thing I can
imagine is that a bit in the cpu_hardware_enabled cpumask is wrong after
resume.
Btw, is it guaranteed that with cpu-hotplug the CPU isn't already
executing processes when the CPU_ONLINE event call chain is called?
At least the CPU is marked online and active at that point in time.

	Joerg


* Re: Problems on AMD laptops
From: Avi Kivity @ 2009-06-29 14:54 UTC
  To: Joerg Roedel; +Cc: KVM list

On 06/29/2009 05:39 PM, Joerg Roedel wrote:
> On Mon, Jun 29, 2009 at 04:41:17PM +0300, Avi Kivity wrote:
>    
>> kerneloops.org shows tons of oopses on amd, see
>> http://www.kerneloops.org/oops.php?number=79008.  I suspect this has to
>> do with resuming a laptop while a guest is running.  Can anyone confirm
>> or deny?
>>      
>
> I haven't verified this yet, but it may have to do with dirty caches that
> are not written back to memory in suspend-to-ram and are thus lost.

Wouldn't that kill resume generally, not just kvm on amd?

>   The
> resume code-path looks otherwise sane to me. The only thing I can
> imagine is that a bit in the cpu_hardware_enabled cpumask is wrong after
> resume.
>    

I saw some of these oopses on cpu 0, which had better be plugged in.

> Btw, is it guaranteed that with cpu-hotplug the CPU isn't already
> executing processes when the CPU_ONLINE event call chain is called?
> At least the CPU is marked online and active at that point in time.
>    

Yes:

static struct notifier_block kvm_cpu_notifier = {
     .notifier_call = kvm_cpu_hotplug,
     .priority = 20, /* must be > scheduler priority */
};
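
A minimal user-space sketch of why the priority matters, assuming the
notifier chain is kept sorted by descending priority so that the
highest-priority callback runs first. All names here (chain_register,
chain_call, kvm_cb, sched_cb, demo_order) are hypothetical, and the
scheduler priority of 10 is an assumed value for illustration only;
this is not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model of a priority-sorted notifier chain. */
struct notifier {
    int (*call)(int cpu);
    int priority;               /* higher value runs earlier */
    struct notifier *next;
};

/* Insert nb so the list stays sorted by descending priority. */
static void chain_register(struct notifier **head, struct notifier *nb)
{
    assert(nb != NULL);
    while (*head != NULL && (*head)->priority >= nb->priority)
        head = &(*head)->next;
    nb->next = *head;
    *head = nb;
}

/* Walk the chain in list order; returns the number of callbacks run. */
static int chain_call(struct notifier *head, int cpu)
{
    int n = 0;
    for (; head != NULL; head = head->next, n++)
        head->call(cpu);
    return n;
}

static int hw_enabled;      /* set by the stand-in "kvm" callback */
static int sched_saw_hw;    /* what the stand-in "scheduler" callback saw */

static int kvm_cb(int cpu)   { (void)cpu; hw_enabled = 1; return 0; }
static int sched_cb(int cpu) { (void)cpu; sched_saw_hw = hw_enabled; return 0; }

/*
 * Register the scheduler first and kvm second. The priority, not the
 * registration order, decides who runs first. Returns 1 if the
 * scheduler callback ran after hardware was already enabled.
 */
static int demo_order(void)
{
    struct notifier sched = { sched_cb, 10, NULL };  /* 10 is assumed */
    struct notifier kvm   = { kvm_cb,   20, NULL };
    struct notifier *head = NULL;

    hw_enabled = 0;
    sched_saw_hw = 0;
    chain_register(&head, &sched);
    chain_register(&head, &kvm);
    chain_call(head, 0);
    return sched_saw_hw;
}
```

In this model demo_order() reports that the scheduler callback already
sees the hardware enabled, which is the ordering the priority comment
asks for.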

One thing I think is missing is a call to svm_cpu_init() on real hotplug.

-- 
error compiling committee.c: too many arguments to function



* Re: Problems on AMD laptops
From: Joerg Roedel @ 2009-06-29 18:26 UTC
  To: Avi Kivity; +Cc: KVM list

On Mon, Jun 29, 2009 at 05:54:46PM +0300, Avi Kivity wrote:
> On 06/29/2009 05:39 PM, Joerg Roedel wrote:
>> On Mon, Jun 29, 2009 at 04:41:17PM +0300, Avi Kivity wrote:
>>    
>>> kerneloops.org shows tons of oopses on amd, see
>>> http://www.kerneloops.org/oops.php?number=79008.  I suspect this has to
>>> do with resuming a laptop while a guest is running.  Can anyone confirm
>>> or deny?
>>>      
>>
>> I haven't verified this yet, but it may have to do with dirty caches that
>> are not written back to memory in suspend-to-ram and are thus lost.
>
> Wouldn't that kill resume generally, not just kvm on amd?

It's a race condition which may be more likely on some hardware than on
other hardware. I remember similar bugs fixed by Mark in the past.

>>   The
>> resume code-path looks otherwise sane to me. The only thing I can
>> imagine is that a bit in the cpu_hardware_enabled cpumask is wrong after
>> resume.
>>    
>
> I saw some of these oopses on cpu 0, which had better be plugged in.

Yeah, but if this bit is set to 0 on suspend and this change does not
make it from cache to main memory it can still be 1 on resume. And
virtualization hardware will not be re-enabled then.
Anyway, this was only a guess from me. I think we should reproduce this
oops and find out what is really going on.
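
To make the guess concrete, here is a minimal user-space sketch. It
assumes the enable path returns early when the mask bit is already set;
hw_enabled_mask, hw_really_enabled and the write_reaches_memory flag are
hypothetical stand-ins that only model the scenario, not the real code.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

static unsigned long hw_enabled_mask;     /* hypothetical stand-in for the cpumask */
static bool hw_really_enabled[NR_CPUS];   /* models the actual hardware state */

/* Enable virtualization on cpu unless the mask says it is already on. */
static void hardware_enable(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    if (hw_enabled_mask & (1UL << cpu))
        return;                           /* mask claims "enabled": skip */
    hw_enabled_mask |= 1UL << cpu;
    hw_really_enabled[cpu] = true;
}

/*
 * Disable on suspend. The hardware state is always lost, but the mask
 * update only survives if the write makes it to main memory.
 */
static void hardware_disable(int cpu, bool write_reaches_memory)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    hw_really_enabled[cpu] = false;
    if (write_reaches_memory)
        hw_enabled_mask &= ~(1UL << cpu);
}

/* Returns whether cpu 0 has virtualization enabled after suspend/resume. */
static bool suspend_resume(bool write_reaches_memory)
{
    hw_enabled_mask = 0;
    hw_really_enabled[0] = false;

    hardware_enable(0);                          /* guest running */
    hardware_disable(0, write_reaches_memory);   /* suspend */
    hardware_enable(0);                          /* resume */
    return hw_really_enabled[0];
}
```

With the write lost, the stale bit makes the resume path skip the
enable, so the model ends up with cpu 0 marked enabled in the mask while
the hardware is off. That would also cover cpu 0.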

>> Btw, is it guaranteed that with cpu-hotplug the CPU isn't already
>> executing processes when the CPU_ONLINE event call chain is called?
>> At least the CPU is marked online and active at that point in time.
>>    
>
> Yes:
>
> static struct notifier_block kvm_cpu_notifier = {
>     .notifier_call = kvm_cpu_hotplug,
>     .priority = 20, /* must be > scheduler priority */
> };

Ok.

> One thing I think is missing is a call to svm_cpu_init() on real hotplug.

Yes, true. There is no svm_data allocated for cpus not online on module
load.
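
A minimal user-space sketch of that gap, assuming per-CPU data is only
allocated for CPUs that are online at module load. struct cpu_data,
cpu_init, module_load and cpu_hotplug are hypothetical names standing in
for the real per-CPU setup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define NR_CPUS 4

struct cpu_data { int cpu; };                  /* hypothetical placeholder */
static struct cpu_data *per_cpu_data[NR_CPUS];

/* Allocate per-CPU data once; returns 0 on success, -1 on failure. */
static int cpu_init(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    if (per_cpu_data[cpu] != NULL)
        return 0;                              /* already set up */
    per_cpu_data[cpu] = calloc(1, sizeof(*per_cpu_data[cpu]));
    if (per_cpu_data[cpu] == NULL)
        return -1;
    per_cpu_data[cpu]->cpu = cpu;
    return 0;
}

/* Module load: only CPUs that are online right now get their data. */
static int module_load(const bool online[NR_CPUS])
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (online[cpu] && cpu_init(cpu) != 0)
            return -1;
    return 0;
}

/*
 * Hotplug: returns true if the CPU has per-CPU data afterwards.
 * Without init_on_hotplug, a CPU that was offline at module load
 * has nothing to work with.
 */
static bool cpu_hotplug(int cpu, bool init_on_hotplug)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    if (init_on_hotplug && cpu_init(cpu) != 0)
        return false;
    return per_cpu_data[cpu] != NULL;
}
```

In this model a CPU plugged in after module load only gets its data if
the hotplug path calls the init function itself.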

	Joerg


