From: Fiona Ebner <f.ebner@proxmox.com>
To: quintela@redhat.com
Cc: qemu-devel@nongnu.org,
"Daniel Henrique Barboza" <danielhb413@gmail.com>,
"Leonardo Bras" <leobras@redhat.com>,
"David Hildenbrand" <david@redhat.com>,
"Christian Borntraeger" <borntraeger@linux.ibm.com>,
qemu-s390x@nongnu.org, "Stefan Hajnoczi" <stefanha@redhat.com>,
"Thomas Huth" <thuth@redhat.com>,
"Marc-André Lureau" <marcandre.lureau@redhat.com>,
"Eric Farman" <farman@linux.ibm.com>,
"David Gibson" <david@gibson.dropbear.id.au>,
"Ilya Leoshkevich" <iii@linux.ibm.com>,
"Fam Zheng" <fam@euphon.net>,
"Vladimir Sementsov-Ogievskiy" <vsementsov@yandex-team.ru>,
"Philippe Mathieu-Daudé" <philmd@linaro.org>,
"Greg Kurz" <groug@kaod.org>, "Halil Pasic" <pasic@linux.ibm.com>,
"Paolo Bonzini" <pbonzini@redhat.com>,
qemu-ppc@nongnu.org, "Peter Xu" <peterx@redhat.com>,
"Harsh Prateek Bora" <harshpb@linux.ibm.com>,
"Hailiang Zhang" <zhanghailiang@xfusion.com>,
"John Snow" <jsnow@redhat.com>, "Cédric Le Goater" <clg@kaod.org>,
qemu-block@nongnu.org, "Eric Blake" <eblake@redhat.com>,
"Jason Wang" <jasowang@redhat.com>,
"Richard Henderson" <richard.henderson@linaro.org>,
"Daniel P. Berrangé" <berrange@redhat.com>
Subject: Re: [PULL 09/12] migration: Use migration_transferred_bytes() to calculate rate_limit
Date: Fri, 26 May 2023 13:37:18 +0200 [thread overview]
Message-ID: <b1072885-8257-92a7-5328-6f3d40290196@proxmox.com> (raw)
In-Reply-To: <877csv3215.fsf@secure.mitica>
Am 26.05.23 um 10:55 schrieb Juan Quintela:
> Fiona Ebner <f.ebner@proxmox.com> wrote:
>> Am 18.05.23 um 19:13 schrieb Juan Quintela:
>>> diff --git a/migration/migration-stats.c b/migration/migration-stats.c
>>> index feec7d7369..97759a45f3 100644
>>> --- a/migration/migration-stats.c
>>> +++ b/migration/migration-stats.c
>>> @@ -24,7 +24,9 @@ bool migration_rate_exceeded(QEMUFile *f)
>>> return true;
>>> }
>>>
>>> - uint64_t rate_limit_used = stat64_get(&mig_stats.rate_limit_used);
>>> + uint64_t rate_limit_start = stat64_get(&mig_stats.rate_limit_start);
>>> + uint64_t rate_limit_current = migration_transferred_bytes(f);
>>> + uint64_t rate_limit_used = rate_limit_current - rate_limit_start;
>>> uint64_t rate_limit_max = stat64_get(&mig_stats.rate_limit_max);
>>>
>>> if (rate_limit_max == RATE_LIMIT_DISABLED) {
>>
>> Hi,
>> just wanted to let you know that the call to
>> migration_transferred_bytes(f) here can introduce a huge performance
>> penalty when taking a snapshot. I ran into the issue when testing
>> something else, with a single-disk snapshot. Without this call it takes
>> about two seconds, with the call about two minutes.
>
> Ouch.
>
> Now that everything is reviewed for that series I can sent the new set
> of patches. As I drop the counter it should just get the speed back.
>
> New series comming that removed rate_limit counter altogether.
>
> Can you take a look after I send it?>
Sure. Please CC me, so I don't forget :)
> Thanks for the report.
>
> And now that we are at it. How are you testing this?
I ran into it while testing the "BQL during save setup" patch [0] on (at
that time current) master.
> As you appears to be the bigger user of snapshots (or at least the
> louder).
We do have quite a few users using snapshots, but it's actually not the
upstream snapshot-save, but a custom QMP command based on it and
migration code-wise. We do keep the VM running while taking the
snapshot, and write the state to a new raw file/block device. When it's
ready, the VM is stopped, the management software takes snapshots of the
disks (via QEMU for qcow2/RBD and via the storage layer for other
storages) and then the VM is resumed.
The caveat is that the saved state will be the one right before the
snapshot finished, not the one when the snapshot started (as with
background migration). If you really want to see the historically grown
details: [1].
It's much older than background-snapshot and we probably could switch to
using background-snapshot with some (minor) changes. I do plan to
evaluate that at some point, but haven't had the time yet.
> Creating tests/qtest/snapshot-test.c could be a good idea.
I can try to take a look at adding tests at some point, but I will be
very busy in the near future, because we have releases coming up. So if
anybody else wants to do it, don't wait for me :)
> 1st to check this kind of breakage.
> 2nd so I be sure that we don't "pesimize" your use case.
>
> Hint, hint.
>
> 2 seconds vs 2 minutes.
>
> A more detailed explanation that what are you doing would be great.
> I.e. you are taking lots of snapshots by second or what?
No, it was just a single snapshot on a single drive for a VM with 2GiB
of RAM, nothing special. I also just "measured" it by counting, I
thought exact values won't matter when it's such a big difference.
Here's the command line:
> ./qemu-system-x86_64 \
> -chardev 'socket,id=qmp,path=/var/run/qemu-server/106.qmp,server=on,wait=off' \
> -mon 'chardev=qmp,mode=control' \
> -pidfile /var/run/qemu-server/106.pid \
> -smp '2,sockets=2,cores=1,maxcpus=2' \
> -nodefaults \
> -vnc 'unix:/var/run/qemu-server/106.vnc,password=on' \
> -m 2048 \
> -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
> -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' \
> -drive 'file=/var/lib/vz/images/106/vm-106-disk-1.qcow2,if=none,id=drive-scsi0,format=qcow2,cache=none,aio=io_uring,detect-zeroes=on,node-name=scsi0' \
> -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=100' \
> -machine 'type=pc'
and QMP params:
> "snapshot-save",
> 'job-id' => "save$i",
> tag => 'snap0',
> vmstate => 'scsi0',
> devices => ['scsi0'],
[0]: https://lists.nongnu.org/archive/html/qemu-devel/2023-05/msg06541.html
[1]:
https://git.proxmox.com/?p=pve-qemu.git;a=blob;f=debian/patches/pve/0018-PVE-add-savevm-async-for-background-state-snapshots.patch;h=d42d0f02f270934610b1ac90b2813b5ee617427d;hb=bd3c1fa52561e4a307e5b5b37754788408fc75a6
Best Regards,
Fiona
next prev parent reply other threads:[~2023-05-26 11:39 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-18 17:12 [PULL 00/12] Migration 20230518 patches Juan Quintela
2023-05-18 17:12 ` [PULL 01/12] configure: add --disable-colo-proxy option Juan Quintela
2023-05-18 17:12 ` [PULL 02/12] migration: split migration_incoming_co Juan Quintela
2023-05-18 17:12 ` [PULL 03/12] migration: process_incoming_migration_co(): move colo part to colo Juan Quintela
2023-05-18 17:12 ` [PULL 04/12] migration: Don't use INT64_MAX for unlimited rate Juan Quintela
2023-05-18 17:12 ` [PULL 05/12] qemu-file: Account for rate_limit usage on qemu_fflush() Juan Quintela
2023-05-18 17:12 ` [PULL 06/12] migration: Move rate_limit_max and rate_limit_used to migration_stats Juan Quintela
2023-05-18 17:12 ` [PULL 07/12] migration: Move migration_total_bytes() to migration-stats.c Juan Quintela
2023-05-18 17:13 ` [PULL 08/12] migration: Add a trace for migration_transferred_bytes Juan Quintela
2023-05-18 17:13 ` [PULL 09/12] migration: Use migration_transferred_bytes() to calculate rate_limit Juan Quintela
2023-05-23 12:31 ` Fiona Ebner
2023-05-26 8:55 ` Juan Quintela
2023-05-26 11:37 ` Fiona Ebner [this message]
2023-07-28 11:48 ` Fiona Ebner
2023-05-18 17:13 ` [PULL 10/12] migration: We don't need the field rate_limit_used anymore Juan Quintela
2023-05-18 17:13 ` [PULL 11/12] migration/multifd: Compute transferred bytes correctly Juan Quintela
2023-05-18 17:13 ` [PULL 12/12] migration: Fix duplicated included in meson.build Juan Quintela
2023-05-19 3:42 ` [PULL 00/12] Migration 20230518 patches Richard Henderson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=b1072885-8257-92a7-5328-6f3d40290196@proxmox.com \
--to=f.ebner@proxmox.com \
--cc=berrange@redhat.com \
--cc=borntraeger@linux.ibm.com \
--cc=clg@kaod.org \
--cc=danielhb413@gmail.com \
--cc=david@gibson.dropbear.id.au \
--cc=david@redhat.com \
--cc=eblake@redhat.com \
--cc=fam@euphon.net \
--cc=farman@linux.ibm.com \
--cc=groug@kaod.org \
--cc=harshpb@linux.ibm.com \
--cc=iii@linux.ibm.com \
--cc=jasowang@redhat.com \
--cc=jsnow@redhat.com \
--cc=leobras@redhat.com \
--cc=marcandre.lureau@redhat.com \
--cc=pasic@linux.ibm.com \
--cc=pbonzini@redhat.com \
--cc=peterx@redhat.com \
--cc=philmd@linaro.org \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=qemu-ppc@nongnu.org \
--cc=qemu-s390x@nongnu.org \
--cc=quintela@redhat.com \
--cc=richard.henderson@linaro.org \
--cc=stefanha@redhat.com \
--cc=thuth@redhat.com \
--cc=vsementsov@yandex-team.ru \
--cc=zhanghailiang@xfusion.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).