From: Claudio Fontana <cfontana@suse.de>
To: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: qemu-devel <qemu-devel@nongnu.org>,
"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
Juan Quintela <quintela@redhat.com>
Subject: Re: starting to look at qemu savevm performance, a first regression detected
Date: Mon, 7 Mar 2022 12:19:22 +0100 [thread overview]
Message-ID: <1f70a086-2b72-bd83-414b-476f5e6d0094@suse.de> (raw)
In-Reply-To: <YiXVh1P4oJNuEtFM@redhat.com>
On 3/7/22 10:51 AM, Daniel P. Berrangé wrote:
> On Mon, Mar 07, 2022 at 10:44:56AM +0100, Claudio Fontana wrote:
>> Hello Daniel,
>>
>> On 3/7/22 10:27 AM, Daniel P. Berrangé wrote:
>>> On Sat, Mar 05, 2022 at 02:19:39PM +0100, Claudio Fontana wrote:
>>>>
>>>> Hello all,
>>>>
>>>> I have been looking at some reports of bad qemu savevm performance in large VMs (around 20+ Gb),
>>>> when used in libvirt commands like:
>>>>
>>>>
>>>> virsh save domain /dev/null
>>>>
>>>>
>>>>
>>>> I have written a simple test to run in a Linux centos7-minimal-2009 guest, which allocates and touches 20G mem.
>>>>
>>>> With any qemu version since around 2020, I am not seeing more than 580 Mb/Sec even in the most ideal of situations.
>>>>
>>>> This drops to around 122 Mb/sec after commit: cbde7be900d2a2279cbc4becb91d1ddd6a014def .
>>>>
>>>> Here is the bisection for this particular drop in throughput:
>>>>
>>>> commit cbde7be900d2a2279cbc4becb91d1ddd6a014def (HEAD, refs/bisect/bad)
>>>> Author: Daniel P. Berrangé <berrange@redhat.com>
>>>> Date: Fri Feb 19 18:40:12 2021 +0000
>>>>
>>>> migrate: remove QMP/HMP commands for speed, downtime and cache size
>>>>
>>>> The generic 'migrate_set_parameters' command handle all types of param.
>>>>
>>>> Only the QMP commands were documented in the deprecations page, but the
>>>> rationale for deprecating applies equally to HMP, and the replacements
>>>> exist. Furthermore the HMP commands are just shims to the QMP commands,
>>>> so removing the latter breaks the former unless they get re-implemented.
>>>>
>>>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>>>> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
>>>
>>> That doesn't make a whole lot of sense as a bisect result.
>>> How reliable is that bisect end point ? Have you bisected
>>> to that point more than once ?
>>
>> I did run through the bisect itself only once, so I'll double check that.
>> The results seem to be reproducible almost to the second though, a savevm that took 35 seconds before the commit takes 2m 48 seconds after.
>>
>> For this test I am using libvirt v6.0.0.
>>
>> If it helps, these are the current_migration->parameters pre-commit (captured in qemu_savevm_state_iterate):
>>
>>
>> pre-commit: in qemu_savevm_state_iterate:
>>
>> (gdb) p current_migration->parameters
>
>> tls_authz = 0x0, has_max_bandwidth = true, max_bandwidth = 9223372036853727232, has_downtime_limit = true, downtime_limit = 300,
>
> snip
>
>> and post-commit: in qemu_savevm_state_iterate:
>>
>> (gdb) p current_migration->parameters
>
> snip
>
>> tls_authz = 0x0, has_max_bandwidth = true, max_bandwidth = 134217728, has_downtime_limit = true, downtime_limit = 300,
>
>> so there seems to be a difference in the max_bandwidth parameter,
>> do we have a limit suddenly having effect for max_bandwidth after the commit?
>
> Yes, that's very strange. I think we'll need to capture the QMP commands that
> libvirt is sending to QEMU, so see if there';s a difference in what it sends.
> This might indicate a latent bug in libvirt.
In the pre-commit case I see:
2022-03-07 10:41:00.928+0000: 132544: info : qemuMonitorJSONIOProcessLine:235 : QEMU_MONITOR_RECV_EVENT: mon=0x7f0fd00028a0 event={"timestamp": {"seconds": 1646649660, "microseconds": 927920}, "event": "STOP"}
2022-03-07 10:41:00.929+0000: 132544: info : qemuMonitorJSONIOProcessLine:240 : QEMU_MONITOR_RECV_REPLY: mon=0x7f0fd00028a0 reply={"return": {}, "id": "libvirt-13"}
2022-03-07 10:41:00.934+0000: 132549: info : qemuMonitorSend:995 : QEMU_MONITOR_SEND_MSG: mon=0x7f0fd00028a0 msg={"execute":"migrate_set_speed","arguments":{"value":9223372036853727232},"id":"libvirt-14"}^M
fd=-1
2022-03-07 10:41:00.934+0000: 132544: info : qemuMonitorIOWrite:452 : QEMU_MONITOR_IO_WRITE: mon=0x7f0fd00028a0 buf={"execute":"migrate_set_speed","arguments":{"value":9223372036853727232},"id":"libvirt-14"}^M
len=93 ret=93 errno=0
2022-03-07 10:41:00.935+0000: 132544: info : qemuMonitorJSONIOProcessLine:240 : QEMU_MONITOR_RECV_REPLY: mon=0x7f0fd00028a0 reply={"return": {}, "id": "libvirt-14"}
2022-03-07 10:41:00.936+0000: 132549: info : qemuMonitorSend:995 : QEMU_MONITOR_SEND_MSG: mon=0x7f0fd00028a0 msg={"execute":"getfd","arguments":{"fdname":"migrate"},"id":"libvirt-15"}^M
fd=32
2022-03-07 10:41:00.936+0000: 132544: info : qemuMonitorIOWrite:452 : QEMU_MONITOR_IO_WRITE: mon=0x7f0fd00028a0 buf={"execute":"getfd","arguments":{"fdname":"migrate"},"id":"libvirt-15"}^M
len=72 ret=72 errno=0
2022-03-07 10:41:00.936+0000: 132544: info : qemuMonitorIOWrite:457 : QEMU_MONITOR_IO_SEND_FD: mon=0x7f0fd00028a0 fd=32 ret=72 errno=0
2022-03-07 10:41:00.937+0000: 132544: info : qemuMonitorJSONIOProcessLine:240 : QEMU_MONITOR_RECV_REPLY: mon=0x7f0fd00028a0 reply={"return": {}, "id": "libvirt-15"}
2022-03-07 10:41:00.937+0000: 132549: info : qemuMonitorSend:995 : QEMU_MONITOR_SEND_MSG: mon=0x7f0fd00028a0 msg={"execute":"migrate","arguments":{"detach":true,"blk":false,"inc":false,"uri":"fd:migrate"},"id":"libvirt-16"}^M
fd=-1
2022-03-07 10:41:00.937+0000: 132544: info : qemuMonitorIOWrite:452 : QEMU_MONITOR_IO_WRITE: mon=0x7f0fd00028a0 buf={"execute":"migrate","arguments":{"detach":true,"blk":false,"inc":false,"uri":"fd:migrate"},"id":"libvirt-16"}^M
len=112 ret=112 errno=0
In the post-commit case I see:
2022-03-07 10:47:07.316+0000: 134386: info : qemuMonitorJSONIOProcessLine:235 : QEMU_MONITOR_RECV_EVENT: mon=0x7fa4380028a0 event={"timestamp": {"seconds": 1646650027, "microseconds": 316537}, "event": "STOP"}
2022-03-07 10:47:07.317+0000: 134386: info : qemuMonitorJSONIOProcessLine:240 : QEMU_MONITOR_RECV_REPLY: mon=0x7fa4380028a0 reply={"return": {}, "id": "libvirt-13"}
2022-03-07 10:47:07.322+0000: 134391: info : qemuMonitorSend:995 : QEMU_MONITOR_SEND_MSG: mon=0x7fa4380028a0 msg={"execute":"migrate_set_speed","arguments":{"value":9223372036853727232},"id":"libvirt-14"}^M
fd=-1
2022-03-07 10:47:07.322+0000: 134386: info : qemuMonitorIOWrite:452 : QEMU_MONITOR_IO_WRITE: mon=0x7fa4380028a0 buf={"execute":"migrate_set_speed","arguments":{"value":9223372036853727232},"id":"libvirt-14"}^M
len=93 ret=93 errno=0
2022-03-07 10:47:07.324+0000: 134386: info : qemuMonitorJSONIOProcessLine:240 : QEMU_MONITOR_RECV_REPLY: mon=0x7fa4380028a0 reply={"id": "libvirt-14", "error": {"class": "CommandNotFound", "desc": "The command migrate_set_speed has not been found"}}
2022-03-07 10:47:07.324+0000: 134391: error : qemuMonitorJSONCheckError:412 : internal error: unable to execute QEMU command 'migrate_set_speed': The command migrate_set_speed has not been found
2022-03-07 10:47:07.324+0000: 134391: info : qemuMonitorSend:995 : QEMU_MONITOR_SEND_MSG: mon=0x7fa4380028a0 msg={"execute":"getfd","arguments":{"fdname":"migrate"},"id":"libvirt-15"}^M
fd=32
2022-03-07 10:47:07.324+0000: 134386: info : qemuMonitorIOWrite:452 : QEMU_MONITOR_IO_WRITE: mon=0x7fa4380028a0 buf={"execute":"getfd","arguments":{"fdname":"migrate"},"id":"libvirt-15"}^M
len=72 ret=72 errno=0
2022-03-07 10:47:07.324+0000: 134386: info : qemuMonitorIOWrite:457 : QEMU_MONITOR_IO_SEND_FD: mon=0x7fa4380028a0 fd=32 ret=72 errno=0
2022-03-07 10:47:07.325+0000: 134386: info : qemuMonitorJSONIOProcessLine:240 : QEMU_MONITOR_RECV_REPLY: mon=0x7fa4380028a0 reply={"return": {}, "id": "libvirt-15"}
2022-03-07 10:47:07.326+0000: 134391: info : qemuMonitorSend:995 : QEMU_MONITOR_SEND_MSG: mon=0x7fa4380028a0 msg={"execute":"migrate","arguments":{"detach":true,"blk":false,"inc":false,"uri":"fd:migrate"},"id":"libvirt-16"}^M
fd=-1
2022-03-07 10:47:07.326+0000: 134386: info : qemuMonitorIOWrite:452 : QEMU_MONITOR_IO_WRITE: mon=0x7fa4380028a0 buf={"execute":"migrate","arguments":{"detach":true,"blk":false,"inc":false,"uri":"fd:migrate"},"id":"libvirt-16"}^M
len=112 ret=112 errno=0
2022-03-07 10:47:07.328+0000: 134386: info : qemuMonitorJSONIOProcessLine:235 : QEMU_MONITOR_RECV_EVENT: mon=0x7fa4380028a0 event={"timestamp": {"seconds": 1646650027, "microseconds": 327843}, "event": "MIGRATION", "data": {"status": "setup"}}
2022-03-07 10:47:07.328+0000: 134386: info : qemuMonitorJSONIOProcessLine:240 : QEMU_MONITOR_RECV_REPLY: mon=0x7fa4380028a0 reply={"return": {}, "id": "libvirt-16"}
2022-03-07 10:47:07.449+0000: 134386: info : qemuMonitorJSONIOProcessLine:235 : QEMU_MONITOR_RECV_EVENT: mon=0x7fa4380028a0 event={"timestamp": {"seconds": 1646650027, "microseconds": 449199}, "event": "MIGRATION_PASS", "data": {"pass": 1}}
2022-03-07 10:47:07.449+0000: 134386: info : qemuMonitorJSONIOProcessLine:235 : QEMU_MONITOR_RECV_EVENT: mon=0x7fa4380028a0 event={"timestamp": {"seconds": 1646650027, "microseconds": 449363}, "event": "MIGRATION", "data": {"status": "active"}}
2022-03-07 10:47:07.807+0000: 134387: info : qemuMonitorSend:995 : QEMU_MONITOR_SEND_MSG: mon=0x7fa4380028a0 msg={"execute":"query-migrate","id":"libvirt-17"}^M
fd=-1
2022-03-07 10:47:07.807+0000: 134386: info : qemuMonitorIOWrite:452 : QEMU_MONITOR_IO_WRITE: mon=0x7fa4380028a0 buf={"execute":"query-migrate","id":"libvirt-17"}^M
len=47 ret=47 errno=0
2022-03-07 10:47:07.809+0000: 134386: info : qemuMonitorJSONIOProcessLine:240 : QEMU_MONITOR_RECV_REPLY: mon=0x7fa4380028a0 reply={"return": {"blocked": false, "expected-downtime": 300, "status": "active", "setup-time": 121, "total-time": 481, "ram": {"total": 32213049344, "postcopy-requests": 0, "dirty-sync-count": 1, "multifd-bytes": 0, "pages-per-second": 971380, "page-size": 4096, "remaining": 31357165568, "mbps": 70.597440000000006, "transferred": 28723376, "duplicate": 202401, "dirty-pages-rate": 0, "skipped": 0, "normal-bytes": 26849280, "normal": 6555}}, "id": "libvirt-17"}
2022-03-07 10:47:20.063+0000: 134386: info : qemuMonitorJSONIOProcessLine:235 : QEMU_MONITOR_RECV_EVENT: mon=0x7fa4380028a0 event={"timestamp": {"seconds": 1646650040, "microseconds": 63299}, "event": "MIGRATION_PASS", "data": {"pass": 2}}
2022-03-07 10:47:20.068+0000: 134386: info : qemuMonitorJSONIOProcessLine:235 : QEMU_MONITOR_RECV_EVENT: mon=0x7fa4380028a0 event={"timestamp": {"seconds": 1646650040, "microseconds": 68660}, "event": "MIGRATION_PASS", "data": {"pass": 3}}
2022-03-07 10:47:20.142+0000: 134386: info : qemuMonitorJSONIOProcessLine:235 : QEMU_MONITOR_RECV_EVENT: mon=0x7fa4380028a0 event={"timestamp": {"seconds": 1646650040, "microseconds": 142735}, "event": "MIGRATION", "data": {"status": "completed"}}
2022-03-07 10:47:20.143+0000: 134391: info : qemuMonitorSend:995 : QEMU_MONITOR_SEND_MSG: mon=0x7fa4380028a0 msg={"execute":"query-migrate","id":"libvirt-18"}^M
fd=-1
2022-03-07 10:47:20.143+0000: 134386: info : qemuMonitorIOWrite:452 : QEMU_MONITOR_IO_WRITE: mon=0x7fa4380028a0 buf={"execute":"query-migrate","id":"libvirt-18"}^M
len=47 ret=47 errno=0
2022-03-07 10:47:20.145+0000: 134386: info : qemuMonitorJSONIOProcessLine:240 : QEMU_MONITOR_RECV_REPLY: mon=0x7fa4380028a0 reply={"return": {"blocked": false, "status": "completed", "setup-time": 121, "downtime": 79, "total-time": 12815, "ram": {"total": 32213049344, "postcopy-requests": 0, "dirty-sync-count": 3, "multifd-bytes": 0, "pages-per-second": 32710, "page-size": 4096, "remaining": 0, "mbps": 584.63040491570825, "transferred": 927267953, "duplicate": 7655360, "dirty-pages-rate": 0, "skipped": 0, "normal-bytes": 856694784, "normal": 209154}}, "id": "libvirt-18"}
2022-03-07 10:47:20.145+0000: 134391: info : qemuMonitorSend:995 : QEMU_MONITOR_SEND_MSG: mon=0x7fa4380028a0 msg={"execute":"migrate_set_speed","arguments":{"value":9223372036853727232},"id":"libvirt-19"}^M
fd=-1
2022-03-07 10:47:20.145+0000: 134386: info : qemuMonitorIOWrite:452 : QEMU_MONITOR_IO_WRITE: mon=0x7fa4380028a0 buf={"execute":"migrate_set_speed","arguments":{"value":9223372036853727232},"id":"libvirt-19"}^M
len=93 ret=93 errno=0
2022-03-07 10:47:20.146+0000: 134386: info : qemuMonitorJSONIOProcessLine:240 : QEMU_MONITOR_RECV_REPLY: mon=0x7fa4380028a0 reply={"id": "libvirt-19", "error": {"class": "CommandNotFound", "desc": "The command migrate_set_speed has not been found"}}
2022-03-07 10:47:20.147+0000: 134391: error : qemuMonitorJSONCheckError:412 : internal error: unable to execute QEMU command 'migrate_set_speed': The command migrate_set_speed has not been found
2022-03-07 10:47:20.150+0000: 134391: info : qemuMonitorClose:917 : QEMU_MONITOR_CLOSE: mon=0x7fa4380028a0 refs=2
>
> If you libvirt_log_filters=2:qemu_monitor then it ought to capture the
> QMP commands.
>
> Regards,
> Daniel
>
next parent reply other threads:[~2022-03-07 12:52 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <8826b03d-e5e9-0e65-cab7-ea1829f48e6c@suse.de>
[not found] ` <YiXQHIWtHx5BocxK@redhat.com>
[not found] ` <62ba8b1e-d641-5b10-c1b3-54b7d5a652e7@suse.de>
[not found] ` <YiXVh1P4oJNuEtFM@redhat.com>
2022-03-07 11:19 ` Claudio Fontana [this message]
2022-03-07 12:00 ` starting to look at qemu savevm performance, a first regression detected Daniel P. Berrangé
2022-03-07 12:09 ` Claudio Fontana
2022-03-07 12:20 ` Daniel P. Berrangé
2022-03-07 12:26 ` Claudio Fontana
2022-03-07 12:28 ` Dr. David Alan Gilbert
2022-03-09 11:15 ` bad qemu savevm to /dev/null performance (600 MiB/s max) (Was: Re: starting to look at qemu savevm performance, a first regression detected) Claudio Fontana
2022-03-09 11:43 ` Dr. David Alan Gilbert
2022-03-09 11:51 ` Daniel P. Berrangé
2022-03-09 14:22 ` Claudio Fontana
2022-03-09 18:27 ` bad virsh save /dev/null performance (600 MiB/s max) Claudio Fontana
2022-03-09 18:37 ` Daniel P. Berrangé
2022-03-09 18:39 ` Dr. David Alan Gilbert
2022-03-09 18:47 ` Claudio Fontana
2022-03-09 18:53 ` Dr. David Alan Gilbert
2022-03-09 18:46 ` Daniel P. Berrangé
2022-03-10 15:25 ` Claudio Fontana
2022-03-09 13:16 ` bad qemu savevm to /dev/null performance (600 MiB/s max) (Was: Re: starting to look at qemu savevm performance, a first regression detected) Claudio Fontana
2022-03-05 13:20 starting to look at qemu savevm performance, a first regression detected Claudio Fontana
2022-03-05 14:11 ` Claudio Fontana
2022-03-07 10:32 ` Dr. David Alan Gilbert
2022-03-07 11:06 ` Claudio Fontana
2022-03-07 11:31 ` Dr. David Alan Gilbert
2022-03-07 12:07 ` Claudio Fontana
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1f70a086-2b72-bd83-414b-476f5e6d0094@suse.de \
--to=cfontana@suse.de \
--cc=berrange@redhat.com \
--cc=dgilbert@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=quintela@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).