From: Zhenyu Ye <yezhenyu2@huawei.com>
To: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
fam@euphon.net, qemu-block@nongnu.org, qemu-devel@nongnu.org,
xiexiangyou@huawei.com, armbru@redhat.com, stefanha@redhat.com,
pbonzini@redhat.com, mreitz@redhat.com
Subject: Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions
Date: Thu, 17 Sep 2020 16:12:18 +0800
Message-ID: <cfbbe585-c252-bc4c-23d7-7e7cfc6224cf@huawei.com>
In-Reply-To: <20200914144251.GO1252186@redhat.com>

Hi Daniel,

On 2020/9/14 22:42, Daniel P. Berrangé wrote:
> On Tue, Aug 11, 2020 at 09:54:08PM +0800, Zhenyu Ye wrote:
>> Hi Kevin,
>>
>> On 2020/8/10 23:38, Kevin Wolf wrote:
>>> On 10.08.2020 at 16:52, Zhenyu Ye wrote:
>>>> Before doing qmp actions, we need to lock the qemu_global_mutex,
>>>> so the qmp actions should not take too long.
>>>>
>>>> Unfortunately, some qmp actions need to acquire the aio context, and
>>>> this may take a long time. The vm will soft lockup if this takes
>>>> too long.
>>>
>>> Do you have a specific situation in mind where getting the lock of an
>>> AioContext can take a long time? I know that the main thread can
>>> block for considerable time, but QMP commands run in the main thread, so
>>> this patch doesn't change anything for this case. It would be effective
>>> if an iothread blocks, but shouldn't everything running in an iothread
>>> be asynchronous and therefore keep the AioContext lock only for a short
>>> time?
>>>
>>
>> Theoretically, everything running in an iothread is asynchronous. However,
>> some 'asynchronous' actions are not entirely non-blocking, such as
>> io_submit(). It will block when the iodepth is too big and the I/O pressure
>> is too high. If we do some qmp actions, such as 'info block', at this time,
>> it may cause a vm soft lockup. This series can make these qmp actions safer.
>>
>> I constructed the scenario as follows:
>> 1. Create a vm with 4 disks, using iothread.
>> 2. Put load on the host CPU. In my scenario, the CPU usage exceeds 95%.
>> 3. Put load on the 4 disks in the vm at the same time. I used fio, and
>> some of the parameters are:
>>
>> fio -rw=randrw -bs=1M -size=1G -iodepth=512 -ioengine=libaio -numjobs=4
>>
>> 4. Run block query actions, for example, via virsh:
>>
>> virsh qemu-monitor-command [vm name] --hmp info block
>>
>> Then the vm will soft lockup; the call trace is:
>
> [snip]
>
>> This problem can be avoided after this series is applied.
>
> At what cost though ? With this timeout, QEMU is going to start
> reporting bogus failures for various QMP commands when running
> under high load, even if those commands would actually run
> successfully. This will turn into an error report from libvirt
> which will in turn probably cause an error in the mgmt application
> using libvirt, and in turn could break the user's automation.
>

I think reporting an error is worth it to avoid a VM soft lockup.
The VM may even crash if kernel.softlockup_panic is set in the guest!
We can increase the timeout value (close to the VM's CPU soft lockup
threshold) to avoid unnecessary errors.
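
To make the trade-off concrete, here is a minimal sketch in Python of the
general idea: try to take a lock with a deadline, and report an error to the
caller instead of blocking indefinitely. This is only an analogy and not the
code from this series; `run_with_lock_timeout`, `CommandTimeout`, and `demo`
are hypothetical names, and a `threading.Lock` stands in for the AioContext
lock held by a busy iothread.

```python
import threading


class CommandTimeout(Exception):
    """Raised when the lock cannot be taken within the allowed time."""


def run_with_lock_timeout(lock, timeout_s, command):
    # Hypothetical helper: wait at most timeout_s seconds for the lock,
    # then fail with an error instead of blocking the caller forever.
    if not lock.acquire(timeout=timeout_s):
        raise CommandTimeout("could not acquire lock in %.1fs" % timeout_s)
    try:
        return command()
    finally:
        lock.release()


def demo():
    ctx_lock = threading.Lock()   # stands in for the AioContext lock
    held = threading.Event()      # set once the worker owns the lock
    release = threading.Event()   # tells the worker to drop the lock

    def busy_iothread():
        # Simulates an iothread stuck in a blocking call while holding the lock.
        with ctx_lock:
            held.set()
            release.wait()

    worker = threading.Thread(target=busy_iothread)
    worker.start()
    held.wait()

    # While the worker holds the lock, the command fails fast with an error.
    try:
        run_with_lock_timeout(ctx_lock, 0.1, lambda: "block info")
        first = "ok"
    except CommandTimeout:
        first = "timeout"

    # Once the worker lets go, the same command succeeds.
    release.set()
    worker.join()
    second = run_with_lock_timeout(ctx_lock, 0.1, lambda: "block info")
    return first, second
```

The larger `timeout_s` is, the fewer spurious failures the caller sees under
load, at the price of blocking the calling thread for longer before giving up.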

Thanks,
Zhenyu