From: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
To: "Michael S. Tsirkin" <mst@redhat.com>,
Alex Williamson <alex.williamson@redhat.com>
Cc: Laszlo Ersek <lersek@redhat.com>,
Zihan Yang <whois.zihan.yang@gmail.com>,
qemu-devel@nongnu.org, Igor Mammedov <imammedo@redhat.com>,
Eric Auger <eauger@redhat.com>, Drew Jones <drjones@redhat.com>,
Wei Huang <wei@redhat.com>
Subject: Re: [Qemu-devel] [RFC 3/3] acpi-build: allocate mcfg for multiple host bridges
Date: Wed, 23 May 2018 19:50:53 +0300 [thread overview]
Message-ID: <74728cc8-0e18-d344-8a88-cf54fd8dc95f@gmail.com> (raw)
In-Reply-To: <20180523171028-mutt-send-email-mst@kernel.org>
On 05/23/2018 05:25 PM, Michael S. Tsirkin wrote:
> On Tue, May 22, 2018 at 10:28:56PM -0600, Alex Williamson wrote:
>> On Wed, 23 May 2018 02:38:52 +0300
>> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>>
>>> On Tue, May 22, 2018 at 03:47:41PM -0600, Alex Williamson wrote:
>>>> On Wed, 23 May 2018 00:44:22 +0300
>>>> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>>>>
>>>>> On Tue, May 22, 2018 at 03:36:59PM -0600, Alex Williamson wrote:
>>>>>> On Tue, 22 May 2018 23:58:30 +0300
>>>>>> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>>>>>>> It's not hard to think of a use-case where >256 devices
>>>>>>> are helpful, for example a nested virt scenario where
>>>>>>> each device is passed on to a different nested guest.
>>>>>>>
>>>>>>> But I think the main feature this is needed for is numa modeling.
>>>>>>> Guests seem to assume a numa node per PCI root, ergo we need more PCI
>>>>>>> roots.
>>>>>> But even if we have NUMA affinity per PCI host bridge, a PCI host
>>>>>> bridge does not necessarily imply a new PCIe domain.
>>>>> What are you calling a PCIe domain?
>>>> Domain/segment
>>>>
>>>> 0000:00:00.0
>>>> ^^^^ This
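To make the terminology concrete: the leading field of the canonical
Linux-style PCI address is exactly that domain/segment number. A tiny
illustrative C snippet, nothing QEMU-specific:

    /* Purely illustrative: split a canonical Linux-style PCI address
     * ("DDDD:BB:DD.F") into its segment/domain, bus, device and function. */
    #include <stdio.h>

    int main(void)
    {
        unsigned int seg, bus, dev, fn;

        if (sscanf("0000:00:00.0", "%x:%x:%x.%x", &seg, &bus, &dev, &fn) == 4) {
            printf("segment %04x bus %02x device %02x function %x\n",
                   seg, bus, dev, fn);
        }
        return 0;
    }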
>>> Right. So we could conceivably have PCIe root complexes share an ACPI segment.
>>> I don't see what this buys us by itself.
>> The ability to define NUMA locality for a PCI sub-hierarchy while
>> maintaining compatibility with non-segment aware OSes (and firmware).
> For sure, but NUMA is a rather advanced topic, and MCFG has been around for
> longer than the various NUMA tables. Are there really non-segment-aware
> guests that also know how to make use of NUMA?
>
Yes, the current pxb devices accomplish exactly that: multiple NUMA nodes
while sharing PCI domain 0.
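For example, something along these lines (a rough sketch; the option
spellings are from memory and may vary between QEMU versions) gives two
extra root buses in domain 0, each associated with its own NUMA node:

    qemu-system-x86_64 -machine q35 -m 4G \
        -numa node,nodeid=0 -numa node,nodeid=1 \
        -device pxb-pcie,id=pcie.1,bus_nr=128,numa_node=0,bus=pcie.0 \
        -device pxb-pcie,id=pcie.2,bus_nr=192,numa_node=1,bus=pcie.0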
Thanks,
Marcel
>>>> Isn't that the only reason we'd need a new MCFG section and the reason
>>>> we're limited to 256 buses? Thanks,
>>>>
>>>> Alex
>>> I don't know whether a single MCFG section can describe multiple roots.
>>> I think it would certainly be unusual.
>> I'm not sure here if you're referring to the actual MCFG ACPI table or
>> the MMCONFIG range, aka the ECAM. Neither of these describe PCI host
>> bridges. The MCFG table can describe one or more ECAM ranges, which
>> provides the ECAM base address, the PCI segment associated with that
>> ECAM and the start and end bus numbers to know the offset and extent of
>> the ECAM range. PCI host bridges would then theoretically be separate
>> ACPI objects with _SEG and _BBN methods to associate them to the
>> correct ECAM range by segment number and base bus number. So the
>> machinery seems to exist for providing an ECAM/MMCONFIG range per
>> PCI host bridge, even if they all live within the same domain, but in
>> practice what I see on systems I have access to is a single MMCONFIG
>> range supporting all of the host bridges. It also seems there are
>> numerous ways to describe the MMCONFIG range and I haven't actually
>> found an example that seems to use the MCFG table. Two of my systems
>> have MCFG tables (which don't seem terribly complete) and the kernel claims to
>> find the MMCONFIG via e820, another doesn't even have an MCFG table and
>> the kernel claims to find MMCONFIG via an ACPI motherboard resource.
>> I'm not sure if I can enable PCI segments on anything to see how the
>> firmware changes. Thanks,
>>
>> Alex
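As background on how such an ECAM range is consumed once something (MCFG,
e820, or a motherboard resource) points at it, the address computation
itself is fixed; a minimal C sketch, with names of my own choosing:

    #include <stdint.h>

    /*
     * Illustrative only, names are mine: compute the MMIO address of a
     * config-space register inside one ECAM window, given the base address
     * and starting bus number from a single MCFG allocation entry.  Each
     * bus decodes 1 MiB: 32 devices * 8 functions * 4 KiB of config space.
     */
    static inline uint64_t ecam_addr(uint64_t base, uint8_t start_bus,
                                     uint8_t bus, uint8_t dev, uint8_t fn,
                                     uint16_t reg)
    {
        return base + (((uint64_t)(bus - start_bus) << 20) |
                       ((uint64_t)dev << 15) |
                       ((uint64_t)fn << 12) |
                       reg);
    }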
> Let me clarify. MCFG contains base address allocation structures.
> Each maps a segment and a range of bus numbers into memory.
> This structure is what I meant.
>
> IIUC you are saying that on your systems everything is within a single
> segment, right? Multiple PCI host bridges map into a single segment?
>
> If you do this you can do NUMA, but you do not gain > 256 devices.
>
> Are we on the same page then?
>
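For reference, the per-segment allocation structure being discussed has the
following shape (a sketch of the PCI Firmware spec layout; the struct name
is mine, QEMU emits the equivalent bytes from acpi-build.c):

    #include <stdint.h>

    /*
     * Sketch of one MCFG "configuration space base address allocation"
     * structure, following the PCI Firmware Specification layout; the
     * struct name is mine.  There is one such 16-byte entry per ECAM
     * window, mapping a segment plus a bus range to a memory window.
     */
    struct mcfg_allocation {
        uint64_t base_address;      /* ECAM base for this window          */
        uint16_t pci_segment;       /* segment group number, matches _SEG */
        uint8_t  start_bus_number;  /* first bus number decoded           */
        uint8_t  end_bus_number;    /* last bus number decoded            */
        uint32_t reserved;
    } __attribute__((packed));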