From: Thomas Huth <thuth@redhat.com>
To: Markus Armbruster <armbru@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>,
	ehabkost@redhat.com,
	Peter Crosthwaite <crosthwaite.peter@gmail.com>,
	qemu-devel@nongnu.org, qemu-stable@nongnu.org,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Alexander Graf <agraf@suse.de>,
	qemu-ppc@nongnu.org, Antony Pavlov <antonynpavlov@gmail.com>,
	stefanha@redhat.com, Cornelia Huck <cornelia.huck@de.ibm.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Alistair Francis <alistair.francis@xilinx.com>,
	afaerber@suse.de, Li Guang <lig.fnst@cn.fujitsu.com>,
	Richard Henderson <rth@twiddle.net>
Subject: Re: [Qemu-devel] [PATCH v3 6/7] qdev: Protect device-list-properties against broken devices
Date: Mon, 28 Sep 2015 11:17:57 +0200
Message-ID: <560905C5.2030209@redhat.com>
In-Reply-To: <87io6vm08l.fsf@blackfin.pond.sub.org>

On 28/09/15 10:11, Markus Armbruster wrote:
> Thomas Huth <thuth@redhat.com> writes:
> 
>> On 25/09/15 16:17, Markus Armbruster wrote:
>>> Thomas Huth <thuth@redhat.com> writes:
>>>
>>>> On 24/09/15 20:57, Markus Armbruster wrote:
>>>>> Several devices don't survive object_unref(object_new(T)): they crash
>>>>> or hang during cleanup, or they leave dangling pointers behind.
>>>>>
>>>>> This breaks at least device-list-properties, because
>>>>> qmp_device_list_properties() needs to create a device to find its
>>>>> properties.  Broken in commit f4eb32b "qmp: show QOM properties in
>>>>> device-list-properties", v2.1.  Example reproducer:
>>>>>
>>>>>     $ qemu-system-aarch64 -nodefaults -display none -machine none -S -qmp stdio
>>>>>     {"QMP": {"version": {"qemu": {"micro": 50, "minor": 4, "major": 2}, "package": ""}, "capabilities": []}}
>>>>>     { "execute": "qmp_capabilities" }
>>>>>     {"return": {}}
>>>>>     { "execute": "device-list-properties", "arguments": { "typename": "pxa2xx-pcmcia" } }
>>>>>     qemu-system-aarch64: /home/armbru/work/qemu/memory.c:1307: memory_region_finalize: Assertion `((&mr->subregions)->tqh_first == ((void *)0))' failed.
>>>>>     Aborted (core dumped)
>>>>>     [Exit 134 (SIGABRT)]
>>>>>
>>>>> Unfortunately, I can't fix the problems in these devices right now.
>>>>> Instead, add DeviceClass member cannot_even_create_with_object_new_yet
>>>>> to mark them:
>> ...
>>>>>  static void pxa2xx_pcmcia_register_types(void)
>>>>> diff --git a/hw/ppc/spapr_rng.c b/hw/ppc/spapr_rng.c
>>>>> index ed43d5e..e1b115d 100644
>>>>> --- a/hw/ppc/spapr_rng.c
>>>>> +++ b/hw/ppc/spapr_rng.c
>>>>> @@ -169,6 +169,11 @@ static void spapr_rng_class_init(ObjectClass *oc, void *data)
>>>>>      dc->realize = spapr_rng_realize;
>>>>>      set_bit(DEVICE_CATEGORY_MISC, dc->categories);
>>>>>      dc->props = spapr_rng_properties;
>>>>> +
>>>>> +    /*
>>>>> +     * Reason: crashes device-introspect-test for unknown reason.
>>>>> +     */
>>>>> +    dc->cannot_even_create_with_object_new_yet = true;
>>>>>  }
>>>>
>>>> Please don't do that! That breaks the help output from
>>>> "-device spapr-rng,?" which should help the user to see how to use this
>>>> device!
>>>
>>> Well, device-introspection-test makes qemu crash, with the backtrace
>>> pointing squarely to this device.  Stands to reason that device
>>> introspection could crash in normal usage, too.  Until the crash is
>>> debugged, we better disable introspection of this device.
>>>
>>> I quite agree that disabling introspection hurts users.  Just not as
>>> much as crashes :)
>>>
>>>> I tried to debug why this device breaks the test, but the test
>>>> environment is giving me a hard time ... how do you best hook a gdb into
>>>> that framework, so you can trace such problems?
>>>> Anyway, with some trial and error, I found out that it seems like the
>>>>
>>>>   object_resolve_path_type("", TYPE_SPAPR_RNG, NULL)
>>>>
>>>> in spapr_rng_instance_init() is causing the problems. Could it be that
>>>> object_resolve_path_type is not working with the test environment?
>>>
>>> I tried to figure out why this device breaks under this test, but
>>> couldn't, so I posted with the "for unknown reason" comment.
>>
>> I've debugged this now for a while (thanks for the tip with
>> MALLOC_PERTURB_, by the way!) and it seems to me that the problem is in
>> the macio object rather than in spapr-rng - the latter is just the
>> victim of some memory corruption caused by the former: The
>> object_resolve_path_type() crashes while trying to go through the macio
>> object.
>>
>> So could you please add the "dc->cannot_even_create_with_object_new_yet
>> = true;" to macio_class_init() instead? ... that seems to fix the crash
>> for me, too, and is likely the better place.
> 
> Hmm.
> 
> For most of the devices my patch marks, we have a pretty good idea on
> what's wrong with them.  spapr-rng is among the exceptions.  You believe
> it's actually "the macio object".  Which one?  "macio" is abstract...
> 
> You report introspecting "spapr-rng" crashes "while trying to go through
> the macio object".  I wonder how omitting introspection of macio objects
> (that's what marking them does to this test) could affect the object
> we're going through when we crash.

I have to correct myself: the crash does not happen while going through
the macio object itself. The problem is actually the "macio[0]" property
that is created during memory_region_init() with
object_property_add_child() ... the property points to a free()d object
when the crash happens.

>> Or maybe we could get this also fixed? The problem could be the
>> memory_region_init(&s->bar, NULL, "macio", 0x80000) in
>> macio_instance_init() ... is this ok here? Or does this rather have to
>> go to the realize() function instead?
> 
> Hmm, does creating and destroying a macio object leave the memory region
> behind?
> 
> Paolo, is calling memory_region_init() in an instance_init() method
> okay?

As Paolo mentioned, we likely need to pass an "owner" to
memory_region_init(). Otherwise the macio memory region gets attached to
"/unattached" instead, and a dangling link property is left behind
when the original macio object is destroyed.
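
To illustrate the failure mode, here is a small self-contained toy model.
It is not QEMU code: Container, Region, Device, region_init() and
stale_children_after_unref() are made-up stand-ins for the QOM parent
object, the memory region, the macio device, memory_region_init() and
object_unref(object_new(T)). It only assumes what is described above:
a region without an owner is parented to a global "/unattached"
container, and destroying the device does not clean up that entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_CHILDREN 8

/* Toy stand-in for an object that holds child properties. */
typedef struct Container {
    void *children[MAX_CHILDREN];
    size_t n_children;
} Container;

/* Toy stand-in for a memory region embedded in a device. */
typedef struct Region {
    Container *parent;      /* whoever holds the child entry for us */
} Region;

/* Toy device with an embedded region, like the "bar" of macio. */
typedef struct Device {
    Container props;        /* children owned by this device */
    Region bar;
} Device;

/* Models the global "/unattached" container. */
static Container unattached;

static void container_add(Container *c, void *child)
{
    assert(c->n_children < MAX_CHILDREN);
    c->children[c->n_children++] = child;
}

/* Models region init: without an owner, fall back to "/unattached". */
static void region_init(Region *r, Device *owner)
{
    r->parent = owner ? &owner->props : &unattached;
    container_add(r->parent, r);
}

static Device *device_new(int pass_owner)
{
    Device *d = calloc(1, sizeof(*d));
    assert(d != NULL);
    region_init(&d->bar, pass_owner ? d : NULL);
    return d;
}

/* Destroying the device only drops the children it owns itself. */
static void device_free(Device *d)
{
    d->props.n_children = 0;
    free(d);
}

/*
 * Create and destroy one device, then report how many entries are
 * still registered in "/unattached". Each such entry points into
 * freed memory, so it is only counted here, never dereferenced.
 */
size_t stale_children_after_unref(int pass_owner)
{
    unattached.n_children = 0;
    Device *d = device_new(pass_owner);
    device_free(d);
    return unattached.n_children;
}
```

In this model, stale_children_after_unref(0) leaves one stale entry
behind, while stale_children_after_unref(1) leaves none. For the real
code the analogous change would presumably be to pass the device object
instead of NULL as the second argument of memory_region_init() in
macio_instance_init(), but I have not tested that.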

By the way, there are some more spots like this in the code, e.g. in
pxa2xx_fir_instance_init() in hw/arm/pxa2xx.c ...

 Thomas
