From: "Denis V. Lunev" <den@virtuozzo.com>
To: Steven Sistare <steven.sistare@oracle.com>,
	Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>,
	qemu-devel@nongnu.org
Cc: "William Roche" <william.roche@oracle.com>,
	"Gerd Hoffmann" <kraxel@redhat.com>,
	"Daniel P. Berrangé" <berrange@redhat.com>
Subject: Re: [BUG, RFC] cpr-transfer: qxl guest driver crashes after migration
Date: Thu, 6 Mar 2025 10:59:30 +0100	[thread overview]
Message-ID: <ab3e61fc-a79f-4c6e-b8e9-d7b501c84644@virtuozzo.com> (raw)
In-Reply-To: <76d393fa-4c40-4331-aa5e-381f34c60c3a@oracle.com>

On 3/5/25 22:19, Steven Sistare wrote:
> On 3/5/2025 11:50 AM, Andrey Drobyshev wrote:
>> On 3/4/25 9:05 PM, Steven Sistare wrote:
>>> On 2/28/2025 1:37 PM, Andrey Drobyshev wrote:
>>>> On 2/28/25 8:35 PM, Andrey Drobyshev wrote:
>>>>> On 2/28/25 8:20 PM, Steven Sistare wrote:
>>>>>> On 2/28/2025 1:13 PM, Steven Sistare wrote:
>>>>>>> On 2/28/2025 12:39 PM, Andrey Drobyshev wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> We've been experimenting with cpr-transfer migration mode
>>>>>>>> recently and have discovered the following issue with the
>>>>>>>> guest QXL driver:
>>>>>>>>
>>>>>>>> Run migration source:
>>>>>>>>> EMULATOR=/path/to/emulator
>>>>>>>>> ROOTFS=/path/to/image
>>>>>>>>> QMPSOCK=/var/run/alma8qmp-src.sock
>>>>>>>>>
>>>>>>>>> $EMULATOR -enable-kvm \
>>>>>>>>>        -machine q35 \
>>>>>>>>>        -cpu host -smp 2 -m 2G \
>>>>>>>>>        -object memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/ram0,share=on \
>>>>>>>>>        -machine memory-backend=ram0 \
>>>>>>>>>        -machine aux-ram-share=on \
>>>>>>>>>        -drive file=$ROOTFS,media=disk,if=virtio \
>>>>>>>>>        -qmp unix:$QMPSOCK,server=on,wait=off \
>>>>>>>>>        -nographic \
>>>>>>>>>        -device qxl-vga
>>>>>>>>
>>>>>>>> Run migration target:
>>>>>>>>> EMULATOR=/path/to/emulator
>>>>>>>>> ROOTFS=/path/to/image
>>>>>>>>> QMPSOCK=/var/run/alma8qmp-dst.sock
>>>>>>>>> $EMULATOR -enable-kvm \
>>>>>>>>>        -machine q35 \
>>>>>>>>>        -cpu host -smp 2 -m 2G \
>>>>>>>>>        -object memory-backend-file,id=ram0,size=2G,mem-path=/dev/shm/ram0,share=on \
>>>>>>>>>        -machine memory-backend=ram0 \
>>>>>>>>>        -machine aux-ram-share=on \
>>>>>>>>>        -drive file=$ROOTFS,media=disk,if=virtio \
>>>>>>>>>        -qmp unix:$QMPSOCK,server=on,wait=off \
>>>>>>>>>        -nographic \
>>>>>>>>>        -device qxl-vga \
>>>>>>>>>        -incoming tcp:0:44444 \
>>>>>>>>>        -incoming '{"channel-type": "cpr", "addr": { "transport": "socket", "type": "unix", "path": "/var/run/alma8cpr-dst.sock"}}'
>>>>>>>>
>>>>>>>>
>>>>>>>> Launch the migration:
>>>>>>>>> QMPSHELL=/root/src/qemu/master/scripts/qmp/qmp-shell
>>>>>>>>> QMPSOCK=/var/run/alma8qmp-src.sock
>>>>>>>>>
>>>>>>>>> $QMPSHELL -p $QMPSOCK <<EOF
>>>>>>>>>        migrate-set-parameters mode=cpr-transfer
>>>>>>>>>        migrate channels=[{"channel-type":"main","addr":{"transport":"socket","type":"inet","host":"0","port":"44444"}},{"channel-type":"cpr","addr":{"transport":"socket","type":"unix","path":"/var/run/alma8cpr-dst.sock"}}]
>>>>>>>>> EOF
>>>>>>>>
>>>>>>>> Then, after a while, the QXL guest driver on the target
>>>>>>>> crashes, spewing the following messages:
>>>>>>>>> [   73.962002] [TTM] Buffer eviction failed
>>>>>>>>> [   73.962072] qxl 0000:00:02.0: object_init failed for (3149824, 0x00000001)
>>>>>>>>> [   73.962081] [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to allocate VRAM BO
>>>>>>>>
>>>>>>>> That seems to be a known kernel QXL driver bug:
>>>>>>>>
>>>>>>>> https://lore.kernel.org/all/20220907094423.93581-1-min_halo@163.com/T/
>>>>>>>> https://lore.kernel.org/lkml/ZTgydqRlK6WX_b29@eldamar.lan/
>>>>>>>>
>>>>>>>> (the latter discussion contains a reproducer script which
>>>>>>>> speeds up triggering the crash in the guest):
>>>>>>>>> #!/bin/bash
>>>>>>>>>
>>>>>>>>> chvt 3
>>>>>>>>>
>>>>>>>>> for j in $(seq 80); do
>>>>>>>>>            echo "$(date) starting round $j"
>>>>>>>>>            if [ "$(journalctl --boot | grep "failed to allocate VRAM BO")" != "" ]; then
>>>>>>>>>                    echo "bug was reproduced after $j tries"
>>>>>>>>>                    exit 1
>>>>>>>>>            fi
>>>>>>>>>            for i in $(seq 100); do
>>>>>>>>>                    dmesg > /dev/tty3
>>>>>>>>>            done
>>>>>>>>> done
>>>>>>>>>
>>>>>>>>> echo "bug could not be reproduced"
>>>>>>>>> exit 0
>>>>>>>>
>>>>>>>> The bug itself seems to remain unfixed, as I was able to
>>>>>>>> reproduce it with a Fedora 41 guest as well as an AlmaLinux 8
>>>>>>>> guest.  However, our cpr-transfer code also seems to be buggy,
>>>>>>>> as it triggers the crash - without the cpr-transfer migration
>>>>>>>> the above reproducer doesn't lead to a crash on the source VM.
>>>>>>>>
>>>>>>>> I suspect that, as cpr-transfer doesn't migrate the guest
>>>>>>>> memory but rather passes it through the memory backend object,
>>>>>>>> our code might somehow corrupt the VRAM.  However, I haven't
>>>>>>>> been able to trace the corruption so far.
>>>>>>>>
>>>>>>>> Could somebody help the investigation and take a look into
>>>>>>>> this?  Any suggestions would be appreciated.  Thanks!
>>>>>>>
>>>>>>> Possibly some memory region created by qxl is not being preserved.
>>>>>>> Try adding these traces to see what is preserved:
>>>>>>>
>>>>>>> -trace enable='*cpr*'
>>>>>>> -trace enable='*ram_alloc*'
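(As an illustration, those options could simply be appended to the
source VM invocation shown earlier - a sketch; with the "log" trace
backend the events go to stderr, or to the file named via -D:)

    -trace enable='*cpr*' \
    -trace enable='*ram_alloc*' \
    -D /tmp/cpr-src-trace.log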
>>>>>>
>>>>>> Also try adding this patch to see if it flags any ram blocks as not
>>>>>> compatible with cpr.  A message is printed at migration start time.
>>>>>> https://lore.kernel.org/qemu-devel/1740667681-257312-1-git-send-email-steven.sistare@oracle.com/
>>>>>>
>>>>>> - Steve
>>>>>>
>>>>>
>>>>> With the traces enabled + the "migration: ram block cpr blockers"
>>>>> patch applied:
>>>>>
>>>>> Source:
>>>>>> cpr_find_fd pc.bios, id 0 returns -1
>>>>>> cpr_save_fd pc.bios, id 0, fd 22
>>>>>> qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 22 host 0x7fec18e00000
>>>>>> cpr_find_fd pc.rom, id 0 returns -1
>>>>>> cpr_save_fd pc.rom, id 0, fd 23
>>>>>> qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 23 host 0x7fec18c00000
>>>>>> cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns -1
>>>>>> cpr_save_fd 0000:00:01.0/e1000e.rom, id 0, fd 24
>>>>>> qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size 262144 fd 24 host 0x7fec18a00000
>>>>>> cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns -1
>>>>>> cpr_save_fd 0000:00:02.0/vga.vram, id 0, fd 25
>>>>>> qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size 67108864 fd 25 host 0x7feb77e00000
>>>>>> cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns -1
>>>>>> cpr_save_fd 0000:00:02.0/qxl.vrom, id 0, fd 27
>>>>>> qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192 fd 27 host 0x7fec18800000
>>>>>> cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns -1
>>>>>> cpr_save_fd 0000:00:02.0/qxl.vram, id 0, fd 28
>>>>>> qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size 67108864 fd 28 host 0x7feb73c00000
>>>>>> cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns -1
>>>>>> cpr_save_fd 0000:00:02.0/qxl.rom, id 0, fd 34
>>>>>> qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536 fd 34 host 0x7fec18600000
>>>>>> cpr_find_fd /rom@etc/acpi/tables, id 0 returns -1
>>>>>> cpr_save_fd /rom@etc/acpi/tables, id 0, fd 35
>>>>>> qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size 2097152 fd 35 host 0x7fec18200000
>>>>>> cpr_find_fd /rom@etc/table-loader, id 0 returns -1
>>>>>> cpr_save_fd /rom@etc/table-loader, id 0, fd 36
>>>>>> qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536 fd 36 host 0x7feb8b600000
>>>>>> cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns -1
>>>>>> cpr_save_fd /rom@etc/acpi/rsdp, id 0, fd 37
>>>>>> qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd 37 host 0x7feb8b400000
>>>>>>
>>>>>> cpr_state_save cpr-transfer mode
>>>>>> cpr_transfer_output /var/run/alma8cpr-dst.sock
>>>>>
>>>>> Target:
>>>>>> cpr_transfer_input /var/run/alma8cpr-dst.sock
>>>>>> cpr_state_load cpr-transfer mode
>>>>>> cpr_find_fd pc.bios, id 0 returns 20
>>>>>> qemu_ram_alloc_shared pc.bios size 262144 max_size 262144 fd 20 host 0x7fcdc9800000
>>>>>> cpr_find_fd pc.rom, id 0 returns 19
>>>>>> qemu_ram_alloc_shared pc.rom size 131072 max_size 131072 fd 19 host 0x7fcdc9600000
>>>>>> cpr_find_fd 0000:00:01.0/e1000e.rom, id 0 returns 18
>>>>>> qemu_ram_alloc_shared 0000:00:01.0/e1000e.rom size 262144 max_size 262144 fd 18 host 0x7fcdc9400000
>>>>>> cpr_find_fd 0000:00:02.0/vga.vram, id 0 returns 17
>>>>>> qemu_ram_alloc_shared 0000:00:02.0/vga.vram size 67108864 max_size 67108864 fd 17 host 0x7fcd27e00000
>>>>>> cpr_find_fd 0000:00:02.0/qxl.vrom, id 0 returns 16
>>>>>> qemu_ram_alloc_shared 0000:00:02.0/qxl.vrom size 8192 max_size 8192 fd 16 host 0x7fcdc9200000
>>>>>> cpr_find_fd 0000:00:02.0/qxl.vram, id 0 returns 15
>>>>>> qemu_ram_alloc_shared 0000:00:02.0/qxl.vram size 67108864 max_size 67108864 fd 15 host 0x7fcd23c00000
>>>>>> cpr_find_fd 0000:00:02.0/qxl.rom, id 0 returns 14
>>>>>> qemu_ram_alloc_shared 0000:00:02.0/qxl.rom size 65536 max_size 65536 fd 14 host 0x7fcdc8800000
>>>>>> cpr_find_fd /rom@etc/acpi/tables, id 0 returns 13
>>>>>> qemu_ram_alloc_shared /rom@etc/acpi/tables size 131072 max_size 2097152 fd 13 host 0x7fcdc8400000
>>>>>> cpr_find_fd /rom@etc/table-loader, id 0 returns 11
>>>>>> qemu_ram_alloc_shared /rom@etc/table-loader size 4096 max_size 65536 fd 11 host 0x7fcdc8200000
>>>>>> cpr_find_fd /rom@etc/acpi/rsdp, id 0 returns 10
>>>>>> qemu_ram_alloc_shared /rom@etc/acpi/rsdp size 4096 max_size 4096 fd 10 host 0x7fcd3be00000
>>>>>
>>>>> Looks like both vga.vram and qxl.vram are being preserved (with
>>>>> the same addresses), and no incompatible ram blocks are found
>>>>> during migration.
>>>>
>>>> Sorry, the addresses are not the same, of course.  However, the
>>>> corresponding ram blocks do seem to be preserved and initialized.
>>>
>>> So far, I have not reproduced the guest driver failure.
>>>
>>> However, I have isolated places where new QEMU improperly writes to
>>> the qxl memory regions prior to starting the guest, by mmap'ing them
>>> readonly after cpr:
>>>
>>>    qemu_ram_alloc_internal()
>>>      if (reused && (strstr(name, "qxl") || strstr(name, "vga")))
>>>          ram_flags |= RAM_READONLY;
>>>      new_block = qemu_ram_alloc_from_fd(...)
>>>
>>> I have attached a draft fix; try it and let me know.
>>> My console window looks fine before and after cpr, using
>>> -vnc $hostip:0 -vga qxl
>>>
>>> - Steve
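(As an aside, the read-only mapping trick generalizes beyond QEMU; the
following is a minimal standalone sketch of the idea - map "preserved"
memory read-only so any unexpected writer faults immediately and shows
up in a backtrace, rather than as silent corruption later.  Illustration
only, not the attached patch:)

    /* debug_ro.c - build with: cc -g -o debug_ro debug_ro.c */
    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/mman.h>

    static char *region;
    static const size_t region_size = 4096;

    static void on_segv(int sig, siginfo_t *si, void *ctx)
    {
        (void)sig; (void)ctx;
        /* fprintf is not async-signal-safe; acceptable for a debug aid */
        fprintf(stderr, "unexpected write to preserved memory at %p\n",
                si->si_addr);
        abort();   /* die here so the backtrace points at the writer */
    }

    int main(void)
    {
        struct sigaction sa;
        memset(&sa, 0, sizeof(sa));
        sa.sa_sigaction = on_segv;
        sa.sa_flags = SA_SIGINFO;
        sigaction(SIGSEGV, &sa, NULL);

        region = mmap(NULL, region_size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (region == MAP_FAILED) {
            perror("mmap");
            return 1;
        }
        memset(region, 0xAA, region_size);          /* "preserved" contents */
        mprotect(region, region_size, PROT_READ);   /* now read-only */

        region[42] = 1;   /* a stray writer - caught immediately */
        return 0;
    }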
>>
>> Regarding the reproducer: when I launch the buggy version with the
>> same options as you, i.e. "-vnc 0.0.0.0:$port -vga qxl", and do
>> cpr-transfer, my VNC client silently hangs on the target after a
>> while.  Could that happen on your setup as well?
>
> cpr does not preserve the vnc connection and session.  To test, I specify
> port 0 for the source VM and port 1 for the dest.  When the src vnc goes
> dormant the dest vnc becomes active.
>
>> Could you try launching the VM with "-nographic -device qxl-vga"?
>> That way the VM's serial console is given to you directly in the
>> shell, so when the qxl driver crashes you're still able to inspect
>> the kernel messages.
>
> I have been running like that, but have not reproduced the qxl driver
> crash, and I suspect my guest image+kernel is too old.  However, once
> I realized the issue was post-cpr modification of qxl memory, I
> switched my attention to the fix.
>
>> As for your patch, I can report that it doesn't resolve the issue as it
>> is.  But I was able to track down another possible memory corruption
>> using your approach with readonly mmap'ing:
>>
>>> Program terminated with signal SIGSEGV, Segmentation fault.
>>> #0  init_qxl_ram (d=0x5638996e0e70) at ../hw/display/qxl.c:412
>>> 412         d->ram->magic       = cpu_to_le32(QXL_RAM_MAGIC);
>>> [Current thread is 1 (Thread 0x7f1a4f83b480 (LWP 229798))]
>>> (gdb) bt
>>> #0  init_qxl_ram (d=0x5638996e0e70) at ../hw/display/qxl.c:412
>>> #1  0x0000563896e7f467 in qxl_realize_common (qxl=0x5638996e0e70, errp=0x7ffd3c2b8170) at ../hw/display/qxl.c:2142
>>> #2  0x0000563896e7fda1 in qxl_realize_primary (dev=0x5638996e0e70, errp=0x7ffd3c2b81d0) at ../hw/display/qxl.c:2257
>>> #3  0x0000563896c7e8f2 in pci_qdev_realize (qdev=0x5638996e0e70, errp=0x7ffd3c2b8250) at ../hw/pci/pci.c:2174
>>> #4  0x00005638970eb54b in device_set_realized (obj=0x5638996e0e70, value=true, errp=0x7ffd3c2b84e0) at ../hw/core/qdev.c:494
>>> #5  0x00005638970f5e14 in property_set_bool (obj=0x5638996e0e70, v=0x5638996f3770, name=0x56389759b141 "realized", opaque=0x5638987893d0, errp=0x7ffd3c2b84e0) at ../qom/object.c:2374
>>> #6  0x00005638970f39f8 in object_property_set (obj=0x5638996e0e70, name=0x56389759b141 "realized", v=0x5638996f3770, errp=0x7ffd3c2b84e0) at ../qom/object.c:1449
>>> #7  0x00005638970f8586 in object_property_set_qobject (obj=0x5638996e0e70, name=0x56389759b141 "realized", value=0x5638996df900, errp=0x7ffd3c2b84e0) at ../qom/qom-qobject.c:28
>>> #8  0x00005638970f3d8d in object_property_set_bool (obj=0x5638996e0e70, name=0x56389759b141 "realized", value=true, errp=0x7ffd3c2b84e0) at ../qom/object.c:1519
>>> #9  0x00005638970eacb0 in qdev_realize (dev=0x5638996e0e70, bus=0x563898cf3c20, errp=0x7ffd3c2b84e0) at ../hw/core/qdev.c:276
>>> #10 0x0000563896dba675 in qdev_device_add_from_qdict (opts=0x5638996dfe50, from_json=false, errp=0x7ffd3c2b84e0) at ../system/qdev-monitor.c:714
>>> #11 0x0000563896dba721 in qdev_device_add (opts=0x563898786150, errp=0x56389855dc40 <error_fatal>) at ../system/qdev-monitor.c:733
>>> #12 0x0000563896dc48f1 in device_init_func (opaque=0x0, opts=0x563898786150, errp=0x56389855dc40 <error_fatal>) at ../system/vl.c:1207
>>> #13 0x000056389737a6cc in qemu_opts_foreach (list=0x563898427b60 <qemu_device_opts>, func=0x563896dc48ca <device_init_func>, opaque=0x0, errp=0x56389855dc40 <error_fatal>) at ../util/qemu-option.c:1135
>>> #14 0x0000563896dc89b5 in qemu_create_cli_devices () at ../system/vl.c:2745
>>> #15 0x0000563896dc8c00 in qmp_x_exit_preconfig (errp=0x56389855dc40 <error_fatal>) at ../system/vl.c:2806
>>> #16 0x0000563896dcb5de in qemu_init (argc=33, argv=0x7ffd3c2b8948) at ../system/vl.c:3838
>>> #17 0x0000563897297323 in main (argc=33, argv=0x7ffd3c2b8948) at ../system/main.c:72
>>
>> So the attached adjusted version of your patch does seem to help.  At
>> least I can't reproduce the crash on my setup.
>
> Thanks for the stack trace; the calls to SPICE_RING_INIT in
> init_qxl_ram are definitely harmful.  Try V2 of the patch, attached,
> which skips the lines of init_qxl_ram that modify guest memory.
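(For illustration only - not the actual V2 patch - the shape of such a
guard might look like the sketch below, where the cpr_reused flag and
the magic value are stand-ins; the real QXL_RAM_MAGIC is defined in
spice-protocol's headers:)

    #include <assert.h>
    #include <stdbool.h>
    #include <stdint.h>

    #define QXL_RAM_MAGIC 0x41525851u   /* placeholder value */

    struct qxl_ram_header { uint32_t magic; /* rings etc. follow */ };

    static void init_ram_header(struct qxl_ram_header *ram, bool cpr_reused)
    {
        if (cpr_reused) {
            /* Guest memory survived cpr: the rings are live, so only
             * sanity-check instead of re-initializing them. */
            assert(ram->magic == QXL_RAM_MAGIC);
            return;
        }
        ram->magic = QXL_RAM_MAGIC;   /* fresh boot: initialize as usual */
    }

    int main(void)
    {
        struct qxl_ram_header ram = { 0 };
        init_ram_header(&ram, false);   /* source VM boot */
        init_ram_header(&ram, true);    /* target after cpr-transfer */
        return 0;
    }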
>
>> I'm wondering, could it be useful to explicitly mark all the reused
>> memory regions readonly upon cpr-transfer, and then make them
>> writable again after the migration is done?  That way we would
>> segfault early on instead of debugging tricky memory corruptions.
>
> It's a useful debugging technique, but changing protection on a large
> memory region can be too expensive for production due to TLB
> shootdowns.
>

Good point.  Though we could put this code behind a non-default option,
so the read-only debugging trick stays available without having to
re-write it each time it is needed.
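For illustration, such a knob could be spelled as a machine property
along these lines (the name is hypothetical, not an existing QEMU
option):

    -machine aux-ram-share=on,cpr-readonly-debug=on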

Den

