From: Greg Kurz <groug@kaod.org>
To: Thomas Huth <thuth@redhat.com>
Cc: "Laurent Vivier" <lvivier@redhat.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
"Markus Armbruster" <armbru@redhat.com>,
"QEMU Developers" <qemu-devel@nongnu.org>,
"qemu-ppc@nongnu.org" <qemu-ppc@nongnu.org>,
"Cédric Le Goater" <clg@kaod.org>,
"David Gibson" <dgibson@redhat.com>
Subject: Re: qemu-system-ppc64 abort()s with pcie bridges
Date: Thu, 9 Jul 2020 11:28:01 +0200 [thread overview]
Message-ID: <20200709112801.0f4cae40@bahia.lan> (raw)
In-Reply-To: <20200708115703.7926205a@bahia.lan>
On Wed, 8 Jul 2020 11:57:03 +0200
Greg Kurz <groug@kaod.org> wrote:
> On Wed, 8 Jul 2020 10:03:47 +0200
> Thomas Huth <thuth@redhat.com> wrote:
>
> >
> > Hi,
> >
> > qemu-system-ppc64 currently abort()s when it is started with a pcie
> > bridge device:
> >
> > $ qemu-system-ppc64 -M pseries-5.1 -device pcie-pci-bridge
> > Unexpected error in object_property_find() at qom/object.c:1240:
> > qemu-system-ppc64: -device pcie-pci-bridge: Property '.chassis_nr' not found
> > Aborted (core dumped)
> >
> > or:
> >
> > $ qemu-system-ppc64 -M pseries -device dec-21154-p2p-bridge
> > Unexpected error in object_property_find() at qom/object.c:1240:
> > qemu-system-ppc64: -device dec-21154-p2p-bridge: Property '.chassis_nr'
> > not found
> > Aborted (core dumped)
> >
> > That's kind of ugly, and it shows up as error when running
> > scripts/device-crash-test. Is there an easy way to avoid the abort() and
> > fail more gracefully here?
> >
>
> And even worse, this can tear down a running guest with hotplug :\
>
> (qemu) device_add pcie-pci-bridge
> Unexpected error in object_property_find() at /home/greg/Work/qemu/qemu-ppc/qom/object.c:1240:
> Property '.chassis_nr' not found
> Aborted (core dumped)
>
> This is caused by recent commit:
>
> commit 7ef1553dac8ef8dbe547b58d7420461a16be0eeb
> Author: Markus Armbruster <armbru@redhat.com>
> Date: Tue May 5 17:29:25 2020 +0200
>
> spapr_pci: Drop some dead error handling
>
> chassis_from_bus() uses object_property_get_uint() to get property
> "chassis_nr" of the bridge device. Failure would be a programming
> error. Pass &error_abort, and simplify its callers.
>
> Cc: David Gibson <david@gibson.dropbear.id.au>
> Cc: qemu-ppc@nongnu.org
> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> Acked-by: David Gibson <david@gibson.dropbear.id.au>
> Reviewed-by: Greg Kurz <groug@kaod.org>
> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
> Message-Id: <20200505152926.18877-18-armbru@redhat.com>
>
> Before that, we would simply print the "chassir_nr not found" error,
> and in case of a cold plugged device exit.
>
> The root cause is that the sPAPR PCI code assumes that a PCI bridge
> has a "chassir_nr" property, ie. it is a standard PCI bridge. Other
> PCI bridge types don't have that. Not sure yet why this information
> is required, I'll check LoPAPR.
>
More on this side : each slot of a PCI bridge is associated a DRC (a
PAPR thingy to handle hot plug/unplug). Each DRC must have a unique
identifier system-wide. We used to use the bus number to compute
the DRC id but it was broken, so we now _hijack_ "chassis_nr" as an
alternative since this commit:
commit 05929a6c5dfe1028ef66250b7bbf11939f8e77cd
Author: David Gibson <david@gibson.dropbear.id.au>
Date: Wed Apr 10 11:49:28 2019 +1000
spapr: Don't use bus number for building DRC ids
This means that we only support the standard pci-bridge device,
and this relies on the availability of "chassis_nr". Failure
to find this property is then not a programming error, but
an expected case where we want to fail gracefully (ie. revert
Markus's commit mentioned above).
While reading code I realized that we have another problem : the
realization of the pci-bridge device does fail if "chassis_nr" is
zero, but I failed to find a uniqueness check. And we get:
$ qemu-system-ppc64 -device pci-bridge,chassis_nr=1 -device pci-bridge,chassis_nr=1
Unexpected error in object_property_try_add() at qom/object.c:1167:
qemu-system-ppc64: -device pci-bridge,chassis_nr=1: attempt to add duplicate property '40000100' to object (type 'container')
Aborted (core dumped)
It is very confusing to see that we state that "chassis_nr" is unique
several times in slotid_cap_init() but it is never enforced anywhere.
if (!chassis) {
error_setg(errp, "Bridge chassis not specified. Each bridge is required"
" to be assigned a unique chassis id > 0.");
return -EINVAL;
}
or
/* We make each chassis unique, this way each bridge is First in Chassis */
Michael, Marcel or anyone with PCI knowledge,
Can you shed some light on the semantics of "chassis_nr" ?
> In the meantime, since we're in soft freeze, I guess we should
> revert Markus's patch and add a big fat comment to explain
> what's going on and maybe change the error message to something
> more informative, eg. "PCIE-to-PCI bridges are not supported".
>
> Thoughts ?
>
> > Thomas
> >
>
prev parent reply other threads:[~2020-07-09 9:28 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-07-08 8:03 qemu-system-ppc64 abort()s with pcie bridges Thomas Huth
2020-07-08 9:57 ` Greg Kurz
2020-07-09 9:28 ` Greg Kurz [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200709112801.0f4cae40@bahia.lan \
--to=groug@kaod.org \
--cc=armbru@redhat.com \
--cc=clg@kaod.org \
--cc=dgibson@redhat.com \
--cc=lvivier@redhat.com \
--cc=mst@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=qemu-ppc@nongnu.org \
--cc=thuth@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).