qemu-devel.nongnu.org archive mirror
* VM crashed while hot-plugging memory
From: Yangming via @ 2023-02-10  9:30 UTC
  To: qemu-devel@nongnu.org, mst@redhat.com, imammedo@redhat.com,
	ani@anisinha.ca
  Cc: wangzhigang (O), zhangliang (AG)

Hello all:

I found that a VM crashes while hot-plugging memory.

Basic information:
qemu version: qemu-master
requirements: hugepages, virtio-gpu

Steps to reproduce:
1. Boot a VM with hugepages and a virtio-gpu device.
2. Connect to the VM over VNC.
3. After the VM has booted, hot-plug 512G of memory.
4. The VNC display freezes and, worse, the VM crashes.

Actually, the vCPU is blocked because of a deadlock.

Analysis:
The BQL is held for the whole hot-plug operation. Meanwhile, virtio-gpu is trying to take the BQL in order to write data. A vCPU is then blocked waiting for the hugepage hot-plug to finish, specifically for the pages to be touched. If the blocked vCPU stalls for several seconds, a soft lockup is reported; if it stalls for a long time, e.g. 30s, the VM crashes.

I am wondering whether there are any ideas for avoiding the VM soft lockup, and even the VM crash?

Thank you!
kind regards!

* Re: VM crashed while hot-plugging memory
From: Igor Mammedov @ 2023-02-23 19:32 UTC
  To: Yangming via
  Cc: Yangming, mst@redhat.com, ani@anisinha.ca, wangzhigang (O),
	zhangliang (AG), David Hildenbrand

On Fri, 10 Feb 2023 09:30:18 +0000
Yangming via <qemu-devel@nongnu.org> wrote:

> Hello all:
> 
> I found that a VM crashes while hot-plugging memory.
> 
> Basic information:
> qemu version: qemu-master
> requirements: hugepages, virtio-gpu
> 
> Steps to reproduce:
> 1. Boot a VM with hugepages and a virtio-gpu device.
> 2. Connect to the VM over VNC.
> 3. After the VM has booted, hot-plug 512G of memory.
> 4. The VNC display freezes and, worse, the VM crashes.
> 
> Actually, the vCPU is blocked because of a deadlock.
> 
> Analysis:
> The BQL is held for the whole hot-plug operation. Meanwhile, virtio-gpu is trying to take the BQL in order to write data. A vCPU is then blocked waiting for the hugepage hot-plug to finish, specifically for the pages to be touched. If the blocked vCPU stalls for several seconds, a soft lockup is reported; if it stalls for a long time, e.g. 30s, the VM crashes.
> 
> I am wondering whether there are any ideas for avoiding the VM soft lockup, and even the VM crash?

Maybe David can suggest something
(CCed)

> 
> Thank you!
> kind regards!




* Re: VM crashed while hot-plugging memory
From: David Hildenbrand @ 2023-02-27 15:49 UTC
  To: Igor Mammedov, Yangming via
  Cc: Yangming, mst@redhat.com, ani@anisinha.ca, wangzhigang (O),
	zhangliang (AG)

On 23.02.23 20:32, Igor Mammedov wrote:
> On Fri, 10 Feb 2023 09:30:18 +0000
> Yangming via <qemu-devel@nongnu.org> wrote:
> 
>> Hello all:
>>
>> I found that a VM crashes while hot-plugging memory.
>>
>> Basic information:
>> qemu version: qemu-master
>> requirements: hugepages, virtio-gpu
>>
>> Steps to reproduce:
>> 1. Boot a VM with hugepages and a virtio-gpu device.
>> 2. Connect to the VM over VNC.
>> 3. After the VM has booted, hot-plug 512G of memory.
>> 4. The VNC display freezes and, worse, the VM crashes.
>>
>> Actually, the vCPU is blocked because of a deadlock.
>>
>> Analysis:
>> The BQL is held for the whole hot-plug operation. Meanwhile, virtio-gpu is trying to take the BQL in order to write data. A vCPU is then blocked waiting for the hugepage hot-plug to finish, specifically for the pages to be touched. If the blocked vCPU stalls for several seconds, a soft lockup is reported; if it stalls for a long time, e.g. 30s, the VM crashes.
>>
>> I am wondering whether there are any ideas for avoiding the VM soft lockup, and even the VM crash?
> 
> Maybe David can suggest something
> (CCed)

Using hugepages usually requires memory preallocation. That 
preallocation is expensive and can take quite some time, and all 
hotplugging operations happen under the BQL.
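
To get a feeling for the numbers, here is a back-of-envelope sketch in
Python. The huge page size (2 MiB) and the zeroing bandwidth (10 GiB/s per
thread) are assumptions picked purely for illustration; neither was
reported in this thread, and real throughput depends heavily on the host.
The helper name prealloc_estimate is made up for this example.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def prealloc_estimate(total_bytes, hugepage_bytes, zero_bw_bytes_per_s, threads=1):
    """Return (number of huge pages, rough seconds needed to touch them all).

    Assumes preallocation time is dominated by zeroing memory and that it
    scales linearly with the number of threads, which is optimistic.
    """
    pages = total_bytes // hugepage_bytes
    seconds = total_bytes / (zero_bw_bytes_per_s * threads)
    return pages, seconds


# Assumed values: 2 MiB huge pages, 10 GiB/s zeroing bandwidth per thread.
pages, secs = prealloc_estimate(512 * GIB, 2 * MIB, 10 * GIB)
print(f"{pages} huge pages, roughly {secs:.0f} s with one thread")
```

Under these assumed numbers a single 512G DIMM spends well over 30 seconds
under the BQL, which would match the reported soft lockup and crash.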

Things that could improve the situation without modifications:

(a) Disable memory preallocation (prealloc=off on the memory backend).
     But that means that your VM may crash if you run out of huge
     pages.

(b) Use a file on a hugetlb mount, and preallocate the memory
     externally, outside of QEMU, before creating the memory backend
     and plugging the DIMM. As all memory is already preallocated,
     plugging the DIMM should be fast.

(c) Use multiple, smaller DIMMs.

(d) Parallel preallocation, using multiple preallocation threads.

(e) Use virtio-mem instead of DIMMs, which will add the memory
     incrementally in smaller steps (e.g., 128 MiB to 2 GiB). But it is
     not supported by all guests (especially not under Windows yet).
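
As a rough illustration of (c) combined with (d), the Python sketch below
only builds the QMP commands for plugging the memory as several smaller
DIMMs; it does not talk to QEMU. The DIMM size (32 GiB), the thread count
(8), the mount point /dev/hugepages and the ids mem1/dimm1 are assumptions
chosen for the example, and dimm_hotplug_plan is a made-up helper name.
The "object-add" arguments use the flat form accepted by recent QEMU
versions.

```python
import json

GIB = 1024 ** 3


def dimm_hotplug_plan(total_bytes, dimm_bytes, mem_path="/dev/hugepages",
                      prealloc_threads=8, first_index=1):
    """Build QMP commands that plug total_bytes as equally sized DIMMs."""
    if total_bytes % dimm_bytes:
        raise ValueError("total size must be a multiple of the DIMM size")
    cmds = []
    count = total_bytes // dimm_bytes
    for i in range(first_index, first_index + count):
        # One hugetlb-backed memory backend per DIMM, preallocated in parallel.
        cmds.append({"execute": "object-add", "arguments": {
            "qom-type": "memory-backend-file",
            "id": f"mem{i}",
            "size": dimm_bytes,
            "mem-path": mem_path,
            "prealloc": True,
            "prealloc-threads": prealloc_threads,
        }})
        # The DIMM device that exposes the backend to the guest.
        cmds.append({"execute": "device_add", "arguments": {
            "driver": "pc-dimm", "id": f"dimm{i}", "memdev": f"mem{i}"}})
    return cmds


plan = dimm_hotplug_plan(512 * GIB, 32 * GIB)
print(len(plan) // 2, "DIMMs")
print(json.dumps(plan[0]))
```

Each backend is still preallocated while its command runs, but every step
covers only 32 GiB here, so the guest should get a chance to run in
between. The VM has to be started with enough free slots and a large
enough maxmem (-m ...,slots=N,maxmem=...) for this to work.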


There are some ideas being discussed upstream on how to make
preallocation with hugetlb faster, in particular having a pool of
pre-zeroed huge pages in the kernel, such that allocating a huge page
gets significantly faster. None of this is upstream yet.

Further, there was the idea of asynchronous preallocation in QEMU. That
could help: one would first create the memory backend, wait until it
has been preallocated asynchronously, and only then plug the DIMM.
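
Why moving the preallocation out from under the lock helps can be shown
with a toy model in Python. This is not QEMU code: big_lock merely stands
in for the BQL, time.sleep() stands in for touching the pages, and all
function names are made up for the illustration. The real locking in QEMU
is considerably more involved.

```python
import threading
import time

big_lock = threading.Lock()  # stands in for the BQL in this toy model


def preallocate(seconds):
    time.sleep(seconds)  # stands in for touching every huge page


def plug_synchronously(prealloc_seconds):
    # Today's behaviour in the model: preallocation runs under the lock.
    with big_lock:
        preallocate(prealloc_seconds)


def plug_asynchronously(prealloc_seconds):
    # The idea: preallocate in a worker first, without holding the lock.
    worker = threading.Thread(target=preallocate, args=(prealloc_seconds,))
    worker.start()
    worker.join()
    with big_lock:
        pass  # the plug itself is assumed to be quick


def longest_vcpu_stall(plug, prealloc_seconds):
    """Run plug() while a fake vCPU keeps taking the lock; return its worst wait."""
    stalls = []
    stop = threading.Event()

    def vcpu():
        while not stop.is_set():
            start = time.monotonic()
            with big_lock:
                pass  # e.g. a device access that needs the lock
            stalls.append(time.monotonic() - start)
            time.sleep(0.001)

    thread = threading.Thread(target=vcpu)
    thread.start()
    plug(prealloc_seconds)
    stop.set()
    thread.join()
    return max(stalls, default=0.0)


sync_stall = longest_vcpu_stall(plug_synchronously, 0.2)
async_stall = longest_vcpu_stall(plug_asynchronously, 0.2)
print(f"sync: {sync_stall:.3f} s, async: {async_stall:.3f} s")
```

In the model, the fake vCPU stalls for about as long as the preallocation
takes in the synchronous case, and barely at all in the asynchronous one.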

-- 
Thanks,

David / dhildenb


