From: Yangming
To: "qemu-devel@nongnu.org", "mst@redhat.com", "imammedo@redhat.com", "ani@anisinha.ca"
CC: "wangzhigang (O)", "zhangliang (AG)"
Subject: VM crashed while hot-plugging memory
Date: Fri, 10 Feb 2023 09:30:18 +0000

Hello all:

 

I found that a VM crashes while hot-plugging memory.

 

Basic information:

qemu version: qemu-master

requirements: hugepages, virtio-gpu

 

It happens with the following steps:

1. Boot a VM with hugepages and a virtio-gpu device.

2. Connect to the VM over VNC.

3. After the VM has booted, hot-plug 512G of memory.

4. The image in VNC then freezes and, worse, the VM crashes.
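
For reference, step 3 could be driven over QMP roughly as sketched below. This is only an illustration under assumptions: the report does not say how the memory was hot-plugged, so the backend type (`memory-backend-file` with `prealloc` enabled), the ids `mem1` and `dimm1`, and the path `/dev/hugepages` are all hypothetical choices. It also assumes the VM was started with spare memory slots (`-m ...,slots=...,maxmem=...`) and a QEMU version that accepts backend properties directly in the `object-add` arguments. The script only builds and prints the commands; it does not connect to a VM.

```python
import json

GIB = 1 << 30  # bytes per GiB


def hotplug_commands(size_gib, backend_id="mem1", dimm_id="dimm1",
                     mem_path="/dev/hugepages"):
    """Build two QMP commands: create a hugepage backend, then plug a DIMM.

    All ids and the hugetlbfs mount point are hypothetical placeholders.
    """
    backend = {
        "execute": "object-add",
        "arguments": {
            "qom-type": "memory-backend-file",
            "id": backend_id,
            "mem-path": mem_path,       # assumed hugetlbfs mount point
            "size": size_gib * GIB,     # size in bytes
            "prealloc": True,           # assumption: pages are touched up front
        },
    }
    dimm = {
        "execute": "device_add",
        "arguments": {
            "driver": "pc-dimm",
            "id": dimm_id,
            "memdev": backend_id,       # must match the backend id above
        },
    }
    return [backend, dimm]


if __name__ == "__main__":
    # Print the commands as JSON lines, ready to send on a QMP socket.
    for cmd in hotplug_commands(512):
        print(json.dumps(cmd))
```

With preallocation enabled, every page of the 512G backend has to be touched before the hot-plug finishes, which is the long-running phase the analysis refers to.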

 

Actually, the vCPU is blocked because of a deadlock.

 

Analysis:

While the memory is being hot-plugged, the BQL is held. Meanwhile, virtio-gpu is trying to take the BQL in order to write data. A vCPU is then blocked waiting for the hugepage hot-plug to finish, specifically for the pages to be touched. If the blocked vCPU stalls for several seconds, a soft lockup occurs; if it stalls for a long time, e.g. 30s, the VM crashes.
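
The contention described above can be modelled with a toy script. This is not QEMU code: the lock `bql`, the functions `hotplug` and `vcpu`, and the sleep that stands in for touching pages are all hypothetical stand-ins. It only shows that a thread needing the lock stalls for as long as the page-touching phase keeps it held.

```python
import threading
import time


def simulate_stall(touch_seconds):
    """Return how long a simulated vCPU waits for a lock held while pages are touched."""
    bql = threading.Lock()        # stand-in for the big QEMU lock
    holding = threading.Event()   # set once the hot-plug thread owns the lock
    result = {}

    def hotplug():
        with bql:
            holding.set()
            time.sleep(touch_seconds)  # stand-in for touching every hugepage

    def vcpu():
        holding.wait()                 # start only after the lock is taken
        start = time.monotonic()
        with bql:                      # e.g. a device write that needs the lock
            result["stall"] = time.monotonic() - start

    threads = [threading.Thread(target=hotplug), threading.Thread(target=vcpu)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result["stall"]


if __name__ == "__main__":
    stall = simulate_stall(0.5)
    print(f"simulated vCPU stalled for {stall:.2f}s")
```

In this model the stall grows linearly with the time spent touching pages, so a hot-plug that takes tens of seconds would keep the simulated vCPU off the lock for the same duration.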

 

I am wondering whether there are any ideas for avoiding the VM soft lockup and even the VM crash.

 

Thank you!

Kind regards!
