* [Qemu-devel] [PATCH] vfio/pci: do not set the PCIDevice 'has_rom' attribute
@ 2018-07-06 16:36 Cédric Le Goater
2018-07-06 17:16 ` Alex Williamson
2018-07-06 17:17 ` Michael S. Tsirkin
0 siblings, 2 replies; 5+ messages in thread
From: Cédric Le Goater @ 2018-07-06 16:36 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, Peter Xu, Alex Williamson, Michael S. Tsirkin,
Marcel Apfelbaum, Cédric Le Goater
PCI devices needing a ROM allocate an optional MemoryRegion with
pci_add_option_rom(). pci_del_option_rom() does the cleanup when the
device is destroyed. The only action taken by this routine is to call
vmstate_unregister_ram() which clears the id string of the optional
ROM RAMBlock and now also flags the RAMBlock as non-migratable. This
was recently added by commit b895de502717 ("migration: discard
non-migratable RAMBlocks").
VFIO devices do their own loading of the PCI option ROM in
vfio_pci_size_rom(). The memory region is switched to an I/O region
and the PCI attribute 'has_rom' is set but the RAMBlock of the ROM
region is not allocated. When the associated PCI device is deleted,
pci_del_option_rom() calls vmstate_unregister_ram() which tries to
flag a NULL RAMBlock, leading to a SEGV.
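The failure condition described above can be sketched as a small
stand-alone model. This is not QEMU code: the struct layouts are reduced
to the fields discussed here, and the model_* helpers are hypothetical
names introduced only for illustration. It assumes the teardown path
runs the RAMBlock cleanup only when 'has_rom' is set.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the QEMU types; only the fields discussed here. */
typedef struct RAMBlock {
    bool migratable;
} RAMBlock;

typedef struct MemoryRegion {
    RAMBlock *ram_block;   /* NULL for an I/O region such as the VFIO ROM */
} MemoryRegion;

typedef struct PCIDevice {
    MemoryRegion rom;
    bool has_rom;
} PCIDevice;

/* Model of the teardown gate: the RAMBlock cleanup runs only if has_rom. */
static bool model_cleanup_runs(const PCIDevice *pdev)
{
    return pdev->has_rom;
}

/* True when teardown would hand a NULL RAMBlock to the cleanup code. */
static bool model_teardown_would_crash(const PCIDevice *pdev)
{
    return model_cleanup_runs(pdev) && pdev->rom.ram_block == NULL;
}
```

In this model, a VFIO-style device (I/O ROM region, no RAMBlock) is only
at risk while 'has_rom' is set; an emulated device with an allocated
RAMBlock is unaffected either way.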
It seems that 'has_rom' was set to have memory_region_destroy()
called, but since commit 469b046ead06 ("memory: remove
memory_region_destroy") this is not necessary anymore as the
MemoryRegion is freed automagically.
Remove the PCIDevice 'has_rom' attribute setting in vfio.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
Tested on a KVM POWER9 pseries machine with a Mellanox MT27710
Ethernet controller. Performed a couple of plug/unplug cycles, migrated,
and did a couple more unplug/plug cycles before powering off.
The same tests were done with the previous patches, which were
addressing the issue at a different level:
1. [PATCH] exec.c: check RAMBlock validity before changing its flag
https://lists.gnu.org/archive/html/qemu-devel/2018-07/msg00009.html
2. [PATCH] pci: remove pci_del_option_rom()
https://lists.gnu.org/archive/html/qemu-devel/2018-07/msg01651.html
Do we still want to remove pci_del_option_rom()?
I caught this bug while deleting a passthrough device from a pseries
machine. Here is the stack:
#0 qemu_ram_unset_migratable (rb=0x0) at /home/legoater/work/qemu/qemu-xive-3.0.git/exec.c:1994
#1 0x000000010072def0 in vmstate_unregister_ram (mr=0x101796af0, dev=<optimized out>)
#2 0x0000000100694e5c in pci_del_option_rom (pdev=0x101796330)
#3 pci_qdev_unrealize (dev=<optimized out>, errp=<optimized out>)
#4 0x00000001005ff910 in device_set_realized (obj=0x101796330, value=<optimized out>, errp=0x0)
#5 0x00000001007a487c in property_set_bool (obj=0x101796330, v=<optimized out>, name=<optimized out>,
#6 0x00000001007a7878 in object_property_set (obj=0x101796330, v=0x7fff70033110,
#7 0x00000001007aaf1c in object_property_set_qobject (obj=0x101796330, value=<optimized out>,
#8 0x00000001007a7b90 in object_property_set_bool (obj=0x101796330, value=<optimized out>,
#9 0x00000001005fcdd8 in device_unparent (obj=0x101796330)
#10 0x00000001007a6dd0 in object_finalize_child_property (obj=<optimized out>, name=<optimized out>,
#11 0x00000001007a50c0 in object_property_del_child (obj=0x10111f800, child=0x101796330,
#12 0x0000000100425cc0 in spapr_phb_remove_pci_device_cb (dev=0x101796330)
#13 0x0000000100427974 in spapr_drc_release (drc=0x1017e2df0)
#14 0x0000000100429098 in spapr_drc_detach (drc=0x1017e2df0)
#15 0x00000001004294e0 in drc_isolate_physical (drc=0x1017e2df0)
#16 0x000000010042a50c in rtas_set_isolation_state (state=0, idx=<optimized out>)
hw/vfio/pci.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index a1577dea7fdb..6cbb8fa0549d 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -990,7 +990,6 @@ static void vfio_pci_size_rom(VFIOPCIDevice *vdev)
     pci_register_bar(&vdev->pdev, PCI_ROM_SLOT,
                      PCI_BASE_ADDRESS_SPACE_MEMORY, &vdev->pdev.rom);
 
-    vdev->pdev.has_rom = true;
     vdev->rom_read_failed = false;
 }
--
2.17.1
* Re: [Qemu-devel] [PATCH] vfio/pci: do not set the PCIDevice 'has_rom' attribute
2018-07-06 16:36 [Qemu-devel] [PATCH] vfio/pci: do not set the PCIDevice 'has_rom' attribute Cédric Le Goater
@ 2018-07-06 17:16 ` Alex Williamson
2018-07-09 7:04 ` Cédric Le Goater
2018-07-06 17:17 ` Michael S. Tsirkin
1 sibling, 1 reply; 5+ messages in thread
From: Alex Williamson @ 2018-07-06 17:16 UTC (permalink / raw)
To: Cédric Le Goater
Cc: qemu-devel, Paolo Bonzini, Peter Xu, Michael S. Tsirkin,
Marcel Apfelbaum
On Fri, 6 Jul 2018 18:36:14 +0200
Cédric Le Goater <clg@kaod.org> wrote:
> PCI devices needing a ROM allocate an optional MemoryRegion with
> pci_add_option_rom(). pci_del_option_rom() does the cleanup when the
> device is destroyed. The only action taken by this routine is to call
> vmstate_unregister_ram() which clears the id string of the optional
> ROM RAMBlock and now, also flags the RAMBlock as non-migratable. This
> was recently added by commit b895de502717 ("migration: discard
> non-migratable RAMBlocks"), .
>
> VFIO devices do their own loading of the PCI option ROM in
> vfio_pci_size_rom(). The memory region is switched to an I/O region
> and the PCI attribute 'has_rom' is set but the RAMBlock of the ROM
> region is not allocated. When the associated PCI device is deleted,
> pci_del_option_rom() calls vmstate_unregister_ram() which tries to
> flag a NULL RAMBlock, leading to a SEGV.
>
> It seems that 'has_rom' was set to have memory_region_destroy()
> called, but since commit 469b046ead06 ("memory: remove
> memory_region_destroy") this is not necessary anymore as the
> MemoryRegion is freed automagically.
>
> Remove the PCIDevice 'has_rom' attribute setting in vfio.
>
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
I think the segfault can be attributed to:
fa53a0e53efd ("memory: drop find_ram_block()")
Prior to that, vmstate_unregister_ram() called
memory_region_get_ram_addr(), which would have resulted in
RAM_ADDR_INVALID. This would have been passed to
qemu_ram_unset_idstr(), which would have used find_ram_block() to look up
the RAMBlock, which would be NULL for the invalid address, safely
avoiding any sort of segfault.
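A minimal sketch of why the older address-based path was safe, under
stated assumptions: the model_* names, the block table, and the
MODEL_RAM_ADDR_INVALID sentinel are stand-ins for illustration, not the
QEMU implementation. The point is only that a lookup by address yields
NULL for an address that matches no block, and the caller skips the
update in that case.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t model_ram_addr_t;

/* Sentinel standing in for "this region has no RAM address". */
#define MODEL_RAM_ADDR_INVALID (~(model_ram_addr_t)0)

typedef struct ModelRAMBlock {
    model_ram_addr_t offset;   /* start address of the block */
    char idstr[64];            /* id string cleared on unregister */
} ModelRAMBlock;

/* Look a block up by its start address; NULL when nothing matches. */
static ModelRAMBlock *model_find_ram_block(ModelRAMBlock *blocks, size_t n,
                                           model_ram_addr_t addr)
{
    for (size_t i = 0; i < n; i++) {
        if (blocks[i].offset == addr) {
            return &blocks[i];
        }
    }
    return NULL;
}

/* Clear the id string; returns 1 if a block was found, 0 otherwise. */
static int model_unset_idstr_by_addr(ModelRAMBlock *blocks, size_t n,
                                     model_ram_addr_t addr)
{
    ModelRAMBlock *rb = model_find_ram_block(blocks, n, addr);

    if (rb == NULL) {
        return 0;              /* invalid address: nothing to do */
    }
    rb->idstr[0] = '\0';
    return 1;
}
```

Passing MODEL_RAM_ADDR_INVALID here is a no-op, which mirrors the "safely
avoiding any sort of segfault" behaviour described above.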
TL;DR, I'll add the above commit with a Fixes: tag for stable and
downstream releases; looks good otherwise. Thanks,
Alex
* Re: [Qemu-devel] [PATCH] vfio/pci: do not set the PCIDevice 'has_rom' attribute
2018-07-06 17:16 ` Alex Williamson
@ 2018-07-09 7:04 ` Cédric Le Goater
2018-07-09 14:30 ` Alex Williamson
0 siblings, 1 reply; 5+ messages in thread
From: Cédric Le Goater @ 2018-07-09 7:04 UTC (permalink / raw)
To: Alex Williamson
Cc: qemu-devel, Paolo Bonzini, Peter Xu, Michael S. Tsirkin,
Marcel Apfelbaum
On 07/06/2018 07:16 PM, Alex Williamson wrote:
> On Fri, 6 Jul 2018 18:36:14 +0200
> Cédric Le Goater <clg@kaod.org> wrote:
>
>> PCI devices needing a ROM allocate an optional MemoryRegion with
>> pci_add_option_rom(). pci_del_option_rom() does the cleanup when the
>> device is destroyed. The only action taken by this routine is to call
>> vmstate_unregister_ram() which clears the id string of the optional
>> ROM RAMBlock and now, also flags the RAMBlock as non-migratable. This
>> was recently added by commit b895de502717 ("migration: discard
>> non-migratable RAMBlocks"), .
>>
>> VFIO devices do their own loading of the PCI option ROM in
>> vfio_pci_size_rom(). The memory region is switched to an I/O region
>> and the PCI attribute 'has_rom' is set but the RAMBlock of the ROM
>> region is not allocated. When the associated PCI device is deleted,
>> pci_del_option_rom() calls vmstate_unregister_ram() which tries to
>> flag a NULL RAMBlock, leading to a SEGV.
>>
>> It seems that 'has_rom' was set to have memory_region_destroy()
>> called, but since commit 469b046ead06 ("memory: remove
>> memory_region_destroy") this is not necessary anymore as the
>> MemoryRegion is freed automagically.
>>
>> Remove the PCIDevice 'has_rom' attribute setting in vfio.
>>
>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>
> I think the segfault can be attributed to:
>
> fa53a0e53efd ("memory: drop find_ram_block()")
>
> Prior to that vmstate_unregister_ram() called
> memory_region_get_ram_addr() which would have resulted in
> RAM_ADDR_INVALID. This would have been passed to
> qemu_ram_unset_idstr() which would have used find_ram_block() to lookup
> the RAMBlock, which would be NULL for the invalid address, safely
> avoiding any sort of segfault.
Yes, but since then, commit b895de502717 ("migration: discard
non-migratable RAMBlocks") added:

 void vmstate_unregister_ram(MemoryRegion *mr, DeviceState *dev)
 {
     qemu_ram_unset_idstr(mr->ram_block);
+    qemu_ram_unset_migratable(mr->ram_block);
 }

and qemu_ram_unset_migratable() does not check the block validity.
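For comparison, here is a hedged sketch of what checking the block
validity at this level could look like. It is a stand-alone model, not
the code of any posted patch: GuardedRAMBlock and the guarded_* helpers
are hypothetical names, and the struct holds only the two pieces of
state mentioned in this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Reduced stand-in for a RAMBlock: just the id string and the flag. */
typedef struct GuardedRAMBlock {
    char idstr[64];
    bool migratable;
} GuardedRAMBlock;

/* Clears the id string; tolerates a NULL block. */
static void guarded_unset_idstr(GuardedRAMBlock *rb)
{
    if (rb != NULL) {
        rb->idstr[0] = '\0';
    }
}

/* Clears the migratable flag; the NULL check is the point of the sketch. */
static void guarded_unset_migratable(GuardedRAMBlock *rb)
{
    if (rb != NULL) {
        rb->migratable = false;
    }
}

/* Model of the unregister step: both helpers accept a missing block. */
static void guarded_unregister_ram(GuardedRAMBlock *rb)
{
    guarded_unset_idstr(rb);
    guarded_unset_migratable(rb);
}
```

With both helpers guarded, unregistering a region that has no RAMBlock
becomes a no-op instead of a NULL dereference. The patch in this thread
takes the other route and avoids reaching this code for VFIO devices.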
C.
* Re: [Qemu-devel] [PATCH] vfio/pci: do not set the PCIDevice 'has_rom' attribute
2018-07-09 7:04 ` Cédric Le Goater
@ 2018-07-09 14:30 ` Alex Williamson
0 siblings, 0 replies; 5+ messages in thread
From: Alex Williamson @ 2018-07-09 14:30 UTC (permalink / raw)
To: Cédric Le Goater
Cc: qemu-devel, Paolo Bonzini, Peter Xu, Michael S. Tsirkin,
Marcel Apfelbaum
On Mon, 9 Jul 2018 09:04:47 +0200
Cédric Le Goater <clg@kaod.org> wrote:
> On 07/06/2018 07:16 PM, Alex Williamson wrote:
> > On Fri, 6 Jul 2018 18:36:14 +0200
> > Cédric Le Goater <clg@kaod.org> wrote:
> >
> >> PCI devices needing a ROM allocate an optional MemoryRegion with
> >> pci_add_option_rom(). pci_del_option_rom() does the cleanup when the
> >> device is destroyed. The only action taken by this routine is to call
> >> vmstate_unregister_ram() which clears the id string of the optional
> >> ROM RAMBlock and now, also flags the RAMBlock as non-migratable. This
> >> was recently added by commit b895de502717 ("migration: discard
> >> non-migratable RAMBlocks"), .
> >>
> >> VFIO devices do their own loading of the PCI option ROM in
> >> vfio_pci_size_rom(). The memory region is switched to an I/O region
> >> and the PCI attribute 'has_rom' is set but the RAMBlock of the ROM
> >> region is not allocated. When the associated PCI device is deleted,
> >> pci_del_option_rom() calls vmstate_unregister_ram() which tries to
> >> flag a NULL RAMBlock, leading to a SEGV.
> >>
> >> It seems that 'has_rom' was set to have memory_region_destroy()
> >> called, but since commit 469b046ead06 ("memory: remove
> >> memory_region_destroy") this is not necessary anymore as the
> >> MemoryRegion is freed automagically.
> >>
> >> Remove the PCIDevice 'has_rom' attribute setting in vfio.
> >>
> >> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> >
> > I think the segfault can be attributed to:
> >
> > fa53a0e53efd ("memory: drop find_ram_block()")
> >
> > Prior to that vmstate_unregister_ram() called
> > memory_region_get_ram_addr() which would have resulted in
> > RAM_ADDR_INVALID. This would have been passed to
> > qemu_ram_unset_idstr() which would have used find_ram_block() to lookup
> > the RAMBlock, which would be NULL for the invalid address, safely
> > avoiding any sort of segfault.
>
> Yes, but since, commit b895de502717 ("migration: discard non-migratable
> RAMBlocks") added :
>
> void vmstate_unregister_ram(MemoryRegion *mr, DeviceState *dev)
> {
> qemu_ram_unset_idstr(mr->ram_block);
> + qemu_ram_unset_migratable(mr->ram_block);
> }
>
> and qemu_ram_unset_migratable() does not check the block validity.
OK, yes, I see that qemu_ram_unset_idstr() does avoid the NULL pointer
dereference, so I'll make the Fixes: tag reference b895de502717.
Thanks,
Alex
* Re: [Qemu-devel] [PATCH] vfio/pci: do not set the PCIDevice 'has_rom' attribute
2018-07-06 16:36 [Qemu-devel] [PATCH] vfio/pci: do not set the PCIDevice 'has_rom' attribute Cédric Le Goater
2018-07-06 17:16 ` Alex Williamson
@ 2018-07-06 17:17 ` Michael S. Tsirkin
1 sibling, 0 replies; 5+ messages in thread
From: Michael S. Tsirkin @ 2018-07-06 17:17 UTC (permalink / raw)
To: Cédric Le Goater
Cc: qemu-devel, Paolo Bonzini, Peter Xu, Alex Williamson,
Marcel Apfelbaum
On Fri, Jul 06, 2018 at 06:36:14PM +0200, Cédric Le Goater wrote:
> PCI devices needing a ROM allocate an optional MemoryRegion with
> pci_add_option_rom(). pci_del_option_rom() does the cleanup when the
> device is destroyed. The only action taken by this routine is to call
> vmstate_unregister_ram() which clears the id string of the optional
> ROM RAMBlock and now, also flags the RAMBlock as non-migratable. This
> was recently added by commit b895de502717 ("migration: discard
> non-migratable RAMBlocks"), .
>
> VFIO devices do their own loading of the PCI option ROM in
> vfio_pci_size_rom(). The memory region is switched to an I/O region
> and the PCI attribute 'has_rom' is set but the RAMBlock of the ROM
> region is not allocated. When the associated PCI device is deleted,
> pci_del_option_rom() calls vmstate_unregister_ram() which tries to
> flag a NULL RAMBlock, leading to a SEGV.
>
> It seems that 'has_rom' was set to have memory_region_destroy()
> called, but since commit 469b046ead06 ("memory: remove
> memory_region_destroy") this is not necessary anymore as the
> MemoryRegion is freed automagically.
>
> Remove the PCIDevice 'has_rom' attribute setting in vfio.
>
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>