qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Cédric Le Goater" <clg@kaod.org>,
	qemu-devel@nongnu.org,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>,
	"Peter Maydell" <peter.maydell@linaro.org>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Alex Williamson" <alex.williamson@redhat.com>,
	"David Gibson" <david@gibson.dropbear.id.au>
Subject: Re: [Qemu-devel] [PATCH] exec.c: check RAMBlock validity before changing its flag
Date: Wed, 4 Jul 2018 17:55:20 +0800	[thread overview]
Message-ID: <20180704095520.GD2568@xz-mi> (raw)
In-Reply-To: <95fe81b1-13ee-5c82-22e2-e5ef4abd1168@redhat.com>

On Wed, Jul 04, 2018 at 11:34:55AM +0200, Paolo Bonzini wrote:
> On 04/07/2018 08:42, Cédric Le Goater wrote:
> > On 07/04/2018 04:26 AM, Peter Xu wrote:
> >> On Tue, Jul 03, 2018 at 02:45:24PM +0200, Cédric Le Goater wrote:
> >>> On 07/02/2018 05:57 AM, Peter Xu wrote:
> >>>> On Sun, Jul 01, 2018 at 07:19:53PM +0200, Cédric Le Goater wrote:
> >>>>> When a PCI device is unplugged, the PCI memory regions are deleted
> >>>>> before the optional ROM RAMBlock is flagged non-migratable. But, when
> >>>>> this is done, the RAMBlock has already been cleared from the region,
> >>>>> leading to a segv.
> >>>>>
> >>>>> Fix the issue by testing the RAMBlock before flagging it, as it is
> >>>>> done in qemu_ram_unset_idstr()
> >>>>>
> >>>>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> >>>>> ---
> >>>>>
> >>>>>  I caught this bug while deleting a passthrough device from a pseries
> >>>>>  machine. Here is the stack:
> >>>>>    
> >>>>>    #0  qemu_ram_unset_migratable (rb=0x0) at /home/legoater/work/qemu/qemu-xive-3.0.git/exec.c:1994
> >>>>>    #1  0x000000010072def0 in vmstate_unregister_ram (mr=0x101796af0, dev=<optimized out>)
> >>>>>    #2  0x0000000100694e5c in pci_del_option_rom (pdev=0x101796330)
> >>>>>    #3  pci_qdev_unrealize (dev=<optimized out>, errp=<optimized out>)
> >>>>>    #4  0x00000001005ff910 in device_set_realized (obj=0x101796330, value=<optimized out>, errp=0x0)
> >>>>>    #5  0x00000001007a487c in property_set_bool (obj=0x101796330, v=<optimized out>, name=<optimized out>, 
> >>>>>    #6  0x00000001007a7878 in object_property_set (obj=0x101796330, v=0x7fff70033110, 
> >>>>>    #7  0x00000001007aaf1c in object_property_set_qobject (obj=0x101796330, value=<optimized out>, 
> >>>>>    #8  0x00000001007a7b90 in object_property_set_bool (obj=0x101796330, value=<optimized out>, 
> >>>>>    #9  0x00000001005fcdd8 in device_unparent (obj=0x101796330)
> >>>>>    #10 0x00000001007a6dd0 in object_finalize_child_property (obj=<optimized out>, name=<optimized out>, 
> >>>>>    #11 0x00000001007a50c0 in object_property_del_child (obj=0x10111f800, child=0x101796330, 
> >>>>>    #12 0x0000000100425cc0 in spapr_phb_remove_pci_device_cb (dev=0x101796330)
> >>>>>    #13 0x0000000100427974 in spapr_drc_release (drc=0x1017e2df0)
> >>>>>    #14 0x0000000100429098 in spapr_drc_detach (drc=0x1017e2df0)
> >>>>>    #15 0x00000001004294e0 in drc_isolate_physical (drc=0x1017e2df0)
> >>>>>    #16 0x000000010042a50c in rtas_set_isolation_state (state=0, idx=<optimized out>)
> >>>>>
> >>>>>  May be we should call pci_del_option_rom() before
> >>>>>  pci_unregister_io_regions() ?
> >>>>
> >>>> This seems to make more sense to me.
> >>>>
> >>>> Meanwhile I assume the name pci_del_option_rom() is a bit misleading -
> >>>> it's not really deleting the ROM but unregistering the ROM only.
> >>>> Instead IIUC it's pci_unregister_io_regions() which deleted that. So
> >>>> maybe we can either rename the function pci_del_option_rom(), or we
> >>>> can pick the ROM destruction out of pci_unregister_io_regions() and
> >>>> put it into pci_del_option_rom() to make sure it's done as the last
> >>>> step?
> >>>
> >>> So it is a little more complex than I thought. 
> >>>
> >>> The PCI device is a vfio PCI device and the PCI ROM region is initialized 
> >>> in vfio_pci_size_rom() with memory_region_init_io(), which does not 
> >>> allocate the RAMBlock, but has_rom is still set to true. 
> >>>
> >>> When the device is deleted, pci_del_option_rom() is called and with it, vmstate_unregister_ram() because has_rom is set to true. Leading to the
> >>> SEGV.
> >>>
> >>> I am not sure how to handle this case. It seems that the realize routine 
> >>> of VFIOPCIDevice is hijacking a little the PCIDevice layer.
> >>
> >> Indeed.
> >>
> >> Then now I'm a bit confused on who actually deleted the ROM memory
> >> region that was created when pci_add_option_rom() was called.  It
> >> seems to be leaked.
> >>
> >> AFAIU the rest of the memory regions of the BARs (0-5) are managed by
> >> specific device emulation code, however this ROM memory region is
> >> managed by PCI subsystem.  Not sure whether that means we should
> >> destroy the region in PCI subsystem too, e.g. in pci_del_option_rom().
> >>
> >> And now I see this patch might be a valid fix for the VFIO-specific
> >> issue (though we might comment that a bit somewhere).
> > 
> > yes. I will send a v2 with an updated commit log.
> 
> I wonder if the fix is simply to... get rid of vmstate_unregister_ram.
> 
> It was added in
> 
>     commit b0e56e0b63f350691b52d3e75e89bb64143fbeff
>     Author: Hu Tao <hutao@cn.fujitsu.com>
>     Date:   Wed Apr 2 15:13:27 2014 +0800
> 
>     unset RAMBlock idstr when unregister MemoryRegion
> 
>     Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
>     Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> 
> whose commit message is a bit lacking, but
> http://lists.gnu.org/archive/html/qemu-devel/2014-04/msg00282.html helps
> more.  It seems like the original bug was a reference count issue.
> 
> Clearing the new migratable flag should also be unnecessary.

But even if we get rid of vmstate_unregister_ram(), the leak could
still be there?

I'm not sure what was leaked when b0e56e0b6 was introduced, I feel
like it's the RAMBlock of the memdev.  Here I think the ROM memory
region seems to be leaked as well (along with the RAMBlock inside)?

Regards,

-- 
Peter Xu

  reply	other threads:[~2018-07-04  9:55 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-01 17:19 [Qemu-devel] [PATCH] exec.c: check RAMBlock validity before changing its flag Cédric Le Goater
2018-07-02  3:57 ` Peter Xu
2018-07-03 12:45   ` Cédric Le Goater
2018-07-04  2:26     ` Peter Xu
2018-07-04  6:42       ` Cédric Le Goater
2018-07-04  9:34         ` Paolo Bonzini
2018-07-04  9:55           ` Peter Xu [this message]
2018-07-04 12:16             ` Paolo Bonzini
2018-07-05  5:56               ` Cédric Le Goater
2018-07-05 12:01                 ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180704095520.GD2568@xz-mi \
    --to=peterx@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=clg@kaod.org \
    --cc=david@gibson.dropbear.id.au \
    --cc=dgilbert@redhat.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).