From: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
To: Laszlo Ersek <lersek@redhat.com>,
Marcel Apfelbaum <marcel@redhat.com>,
qemu-devel@nongnu.org
Cc: mst@redhat.com
Subject: Re: [Qemu-devel] [PATCH] hw/pci: do not update the PCI mappings while Decode (I/O or memory) bit is not set in the Command register
Date: Mon, 11 Jan 2016 18:34:33 +0200
Message-ID: <5693D999.2030504@gmail.com>
In-Reply-To: <5693D447.8070000@redhat.com>
On 01/11/2016 06:11 PM, Laszlo Ersek wrote:
> On 01/11/16 13:24, Marcel Apfelbaum wrote:
>> Two reasons:
>> - PCI Spec indicates that while the bit is not set
>> the memory sizing is not finished.
>> - pci_bar_address will return PCI_BAR_UNMAPPED
>> and a previous value can be accidentally overridden
>> if the command register is modified (and not the BAR).
>>
>> Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
>> ---
>>
>> Hi,
>>
>> I found this when trying to use multiple root complexes with OVMF.
>>
>> When trying to attach a device to the pxb-pcie device as Integrated
>> Device it did not receive the IO/MEM resources.
>>
>> The reason is that OVMF is working like that:
>> 1. It disables the Decode (I/O or memory) bit in the Command register
>> 2. It configures the device BARS
>> 3. Makes some tests on the Command register
>> 4. ...
>> 5. Enables the Decode (I/O or memory) at some point.
>>
>> On step 3 all the BARS are overridden to 0xffffffff by QEMU.
>>
>> Since QEMU uses the device BARs to compute the new host bridge resources
>> it now gets garbage.
>>
>> Laszlo, this also solves the SHPC problem for the pci-2-pci bridge inside the pxb.
>> Now we can enable the SHPC for it too.
>
> I encountered the exact same problem months ago. I posted patches for
> it; you were CC'd. :)
>
> http://thread.gmane.org/gmane.comp.emulators.qemu/342206/focus=342209
> http://thread.gmane.org/gmane.comp.emulators.qemu/342206/focus=342210
>
> As you can see under the second link above, I made the same analysis &
> observations as you do now. (It took me quite long to track down the
> "inexplicable" behavior of edk2's generic PCI bus driver / enumerator
> that is built into OVMF.)
Wow, I just worked through this issue again from scratch! I wish I had remembered those threads :(
This was another symptom of the exact same problem! I did remember something about
SHPC; I should have looked at those mail threads again...
>
> I proposed to change pci_bar_address() so that it could return, to
> distinguished callers, the BAR values "under programming", even if the
> command bits were clear. Then the ACPI generator would utilize this
> special exception.
>
> Michael disagreed; in
>
> http://thread.gmane.org/gmane.comp.emulators.qemu/342206/focus=342242
>
> he wrote "[t]his is problematic - disabled BAR values have no meaning
> according to the PCI spec".
>
Yes... because it looked like a hook for our case only.
The good news is that this patch is based exactly on the fact that
the BARs have no meaning while the bit is not set.
> The current solution to the problem (= we disable the SHPC) was
> recommended by Michael in that message: "It might be best to add a
> property to just disable shpc in the bridge so no devices reside
> directly behind the pxb?"
>
I confess I don't exactly understand what the SHPC of the pci-2-pci bridge
has to do with sibling devices on the pxb's root bus (SHPC is the hot-plug controller
for the devices behind the pci-2-pci bridge).
The second part I do understand: the pxb was designed not to have devices directly behind
it, so maybe he meant that the SHPC is the part of the pci-bridge that behaves like
a device, in the sense that it requires IO/MEM resources.
Bottom line, your solution for the PXB was just fine :)
> In comparison, your patch doesn't change pci_bar_address(). Instead, it
> modifies pci_update_mappings() *not to call* pci_bar_address(), if the
> respective command bits are clear.
>
> I guess that could have about the same effect.
>
> If, unlike my patch, yours actually improves QEMU's compliance with the
> PCI specs, then it's likely a good patch. (And apparently more general
> than the SHPC-specific solution we have now.)
Exactly! Why should a PCI write to the command register *delete*
previously set resources? I see it as a bug.
Also, updating the mappings while the Decode bit is not enabled
is unnecessary at the very least.
>
> I just don't know if it's a good idea to leave any old mappings active
> while the BARs are being reprogrammed (with the command bits clear).
>
First, the OS can't use the IO/MEM anyway while the bit is clear; second, the guest OS/firmware
is the one that disabled the bit (in order to program resources).
> In other words, what guarantees that this change will not regress
> anything? (I'm not doubting -- I'm asking; I honestly don't know.)
>
> So I guess I'll defer to Michael on this one.
Michael, do you agree with the above?
>
> In any case, I fully agree with your analysis of OVMF's behavior.
Thanks! I have been looking for this bug in OVMF for some time now :)
Marcel
>
> Thanks!
> Laszlo
>
>> Thanks,
>> Marcel
>>
>> hw/pci/pci.c | 17 +++++++++++++++++
>> 1 file changed, 17 insertions(+)
>>
>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>> index 168b9cc..f9127dc 100644
>> --- a/hw/pci/pci.c
>> +++ b/hw/pci/pci.c
>> @@ -1148,6 +1148,7 @@ static void pci_update_mappings(PCIDevice *d)
>> PCIIORegion *r;
>> int i;
>> pcibus_t new_addr;
>> + uint16_t cmd = pci_get_word(d->config + PCI_COMMAND);
>>
>> for(i = 0; i < PCI_NUM_REGIONS; i++) {
>> r = &d->io_regions[i];
>> @@ -1156,6 +1157,22 @@ static void pci_update_mappings(PCIDevice *d)
>> if (!r->size)
>> continue;
>>
>> + /*
>> + * Do not update the mappings until the command register's
>> + * Decode (I/O or memory) bit is not set. Two reasons:
>> + * - PCI Spec indicates that while the bit is not set
>> + * the memory sizing is not finished.
>> + * - pci_bar_address will return PCI_BAR_UNMAPPED
>> + * and a previous value can be accidentally overridden
>> + * if the command register is modified (and not the BAR).
>> + * */
>> + if (((r->type & PCI_BASE_ADDRESS_SPACE_IO) &&
>> + !(cmd & PCI_COMMAND_IO)) ||
>> + ((r->type != PCI_BASE_ADDRESS_SPACE_IO) &&
>> + !(cmd & PCI_COMMAND_MEMORY))) {
>> + continue;
>> + }
>> +
>> new_addr = pci_bar_address(d, i, r->type, r->size);
>>
>> /* This bar isn't changed */
>>
>
>
Thread overview: 18+ messages
2016-01-11 12:24 [Qemu-devel] [PATCH] hw/pci: do not update the PCI mappings while Decode (I/O or memory) bit is not set in the Command register Marcel Apfelbaum
2016-01-11 14:07 ` Igor Mammedov
2016-01-11 15:10 ` Marcel Apfelbaum
2016-01-11 16:11 ` Laszlo Ersek
2016-01-11 16:34 ` Marcel Apfelbaum [this message]
2016-01-11 17:15 ` Laszlo Ersek
2016-01-11 18:01 ` Marcel Apfelbaum
2016-01-11 18:44 ` Laszlo Ersek
2016-01-11 18:57 ` Marcel Apfelbaum
2016-01-14 12:24 ` Marcel Apfelbaum
2016-01-14 14:30 ` Laszlo Ersek
2016-01-14 14:49 ` Michael S. Tsirkin
2016-01-14 15:23 ` Marcel Apfelbaum
2016-01-14 15:37 ` Michael S. Tsirkin
2016-01-14 17:20 ` Marcel Apfelbaum
2016-01-14 17:28 ` Michael S. Tsirkin
2016-01-14 18:25 ` Marcel Apfelbaum
2016-01-14 15:14 ` Marcel Apfelbaum