qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Marcel Apfelbaum <marcel@redhat.com>
To: Laszlo Ersek <lersek@redhat.com>, qemu-devel@nongnu.org
Cc: mst@redhat.com
Subject: Re: [Qemu-devel] [PATCH] hw/pci: do not update the PCI mappings while Decode (I/O or memory) bit is not set in the Command register
Date: Mon, 11 Jan 2016 20:01:47 +0200	[thread overview]
Message-ID: <5693EE0B.7030002@redhat.com> (raw)
In-Reply-To: <5693E349.4090201@redhat.com>

On 01/11/2016 07:15 PM, Laszlo Ersek wrote:
> On 01/11/16 17:34, Marcel Apfelbaum wrote:
>> On 01/11/2016 06:11 PM, Laszlo Ersek wrote:
>>> On 01/11/16 13:24, Marcel Apfelbaum wrote:
>>>> Two reasons:
>>>>    - PCI Spec indicates that while the bit is not set
>>>>      the memory sizing is not finished.
>>>>    - pci_bar_address will return PCI_BAR_UNMAPPED
>>>>      and a previous value can be accidentally overridden
>>>>      if the command register is modified (and not the BAR).
>>>>
>>>> Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
>>>> ---
>>>>
>>>> Hi,
>>>>
>>>> I found this when trying to use multiple root complexes with OVMF.
>>>>
>>>> When trying to attach a device to the pxb-pcie device as Integrated
>>>> Device it did not receive the IO/MEM resources.
>>>>
>>>> The reason is that OVMF is working like that:
>>>>    1. It disables the Decode (I/O or memory) bit in the Command register
>>>>    2. It configures the device BARS
>>>>    3. Makes some tests on the Command register
>>>>    4. ...
>>>>    5. Enables the Decode (I/O or memory) at some point.
>>>>
>>>> On step 3 all the BARS are overridden to 0xffffffff by QEMU.
>>>>
>>>> Since QEMU uses the device BARs to compute the new host bridge resources
>>>> it now gets garbage.
>>>>
>>>> Laszlo, this also solves the SHPC problem for the pci-2-pci bridge
>>>> inside the pxb.
>>>> Now we can enable the SHPC for it too.
>>>
>>> I encountered the exact same problem months ago. I posted patches for
>>> it; you were CC'd. :)
>>>
>>> http://thread.gmane.org/gmane.comp.emulators.qemu/342206/focus=342209
>>> http://thread.gmane.org/gmane.comp.emulators.qemu/342206/focus=342210
>>>
>>> As you can see under the second link above, I made the same analysis &
>>> observations as you do now. (It took me quite long to track down the
>>> "inexplicable" behavior of edk2's generic PCI bus driver / enumerator
>>> that is built into OVMF.)
>>
>> Wow, I just re-worked this issue again from 0! I wish I have remembered
>> those threads :(
>> This was another symptom of the exact problem! And I remembered
>> something about
>> SHPC, I should have looked at those mail threads again...
>>
>>>
>>> I proposed to change pci_bar_address() so that it could return, to
>>> distinguished callers, the BAR values "under programming", even if the
>>> command bits were clear. Then the ACPI generator would utilize this
>>> special exception.
>>>
>>> Michael disagreed; in
>>>
>>> http://thread.gmane.org/gmane.comp.emulators.qemu/342206/focus=342242
>>>
>>> he wrote "[t]his is problematic - disabled BAR values have no meaning
>>> according to the PCI spec".
>>>
>>
>> Yes... because it looked like a hook for our case only,
>> the good news is that this patch is based exactly on the fact that
>> the BARs have no meaning if the bit is not set.
>>
>>> The current  solution to the problem (= we disable the SHPC) was
>>> recommended by Michael in that message: "It might be best to add a
>>> property to just disable shpc in the bridge so no devices reside
>>> directly behind the pxb?"
>>>
>>
>> I confess I don't exactly understand what the SHPC of the pci-2-pci bridge
>> has to do with sibling devices on the pxb's root bus (SHPC is the
>> hot-plug controller
>> for the devices behind the pci-2-pci bridge).
>>
>> The second part I do understand, the pxb design was to not have devices
>> directly behind
>> the pxb, so maybe he meant that SHPC is the part of the pci-bridge that
>> behaves like
>> a device in the sense it requires IO/MEM resources.
>>
>> Bottom line, your solution for the PXB was just fine :)
>>
>>
>>> In comparison, your patch doesn't change pci_bar_address(). Instead, it
>>> modifies pci_update_mappings() *not to call* pci_bar_address(), if the
>>> respective command bits are clear.
>>>
>>> I guess that could have about the same effect.
>>>
>>> If, unlike my patch, yours actually improves QEMU's compliance with the
>>> PCI specs, then it's likely a good patch. (And apparently more general
>>> than the SHPC-specific solution we have now.)
>>
>>
>> Exactly! Why should a pci write to the command  register *delete*
>> previously set resources? I am looking at it as a bug.
>>
>> And also updating the mappings while the Decoding bit is not enables
>> is at least not necessary.
>>
>>>
>>> I just don't know if it's a good idea to leave any old mappings active
>>> while the BARs are being reprogrammed (with the command bits clear).
>>>
>>
>> First, because the OS can't use the IO/MEM anyway, secondly the guest
>> OS/firmware
>> is the one that disabled the bit... (in order to program resources)
>
> I have something like the following in mind. Do you think it is a valid
> (although contrived) use case?
>
> - guest programs some BAR and uses it (as designed / intended)
> - guest disables command bit, modifies BAR location
> - guest accesses *old* BAR location
>
> What should a guest *expect* in such a case? Is this invalid guest behavior?

Yes, this is indeed invalid behavior, from the device point of view
it is disabled. Best case scenario - the guest will see 0xffffffff,
worst case - garbage.

>
> If it is not invalid, then will QEMU comply with the guest's
> expectations if your patch is applied? Pre-patch, the guest would likely
> access a "hole" in the host bridge MMIO aperture, whereas with your
> patch (I guess?) it still might access the device through the old (still
> active) BAR?
>

Since the IO is disabled, pci_bar_address will return PCI_BAR_UNMAPPED
and *no updates will be made* pre or post this patch. It will behave the same
from the guest point of view. It will still access the memory region
of the device.


> Or would QEMU prevent that just by virtue of the command bit being clear?

Again, this patch only changes the behavior in a specific case:
when the device is disabled and the guest writes to the command register without
enabling IO/MEM.

Pre-patch -> the old BAR addresses are overridden with 0xffffffff
              (I really think this is a bug, nobody asked for this!!)
Post-Patch -> the old BAR values are returned to the prev state -> correct behavior IMHO.

Please continue to ask questions until we get to the bottom of it. :)

Thanks,
Marcel

>
> Thanks
> Laszlo
>
>>> In other words, what guarantees that this change will not regress
>>> anything? (I'm not doubting -- I'm asking; I honestly don't know.)
>>>
>>> So I guess I'll defer to Michael on this one.
>>
>> Michael, do you agree with the above?
>>
>>>
>>> In any case, I fully agree with your analysis of OVMF's behavior.
>>
>> Thanks! I looked for this bug in OVMF for some time now :)
>> Marcel
>>
>>>
>>> Thanks!
>>> Laszlo
>>>
>>>> Thanks,
>>>> Marcel
>>>>
>>>>    hw/pci/pci.c | 17 +++++++++++++++++
>>>>    1 file changed, 17 insertions(+)
>>>>
>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>>>> index 168b9cc..f9127dc 100644
>>>> --- a/hw/pci/pci.c
>>>> +++ b/hw/pci/pci.c
>>>> @@ -1148,6 +1148,7 @@ static void pci_update_mappings(PCIDevice *d)
>>>>        PCIIORegion *r;
>>>>        int i;
>>>>        pcibus_t new_addr;
>>>> +    uint16_t cmd = pci_get_word(d->config + PCI_COMMAND);
>>>>
>>>>        for(i = 0; i < PCI_NUM_REGIONS; i++) {
>>>>            r = &d->io_regions[i];
>>>> @@ -1156,6 +1157,22 @@ static void pci_update_mappings(PCIDevice *d)
>>>>            if (!r->size)
>>>>                continue;
>>>>
>>>> +        /*
>>>> +         * Do not update the mappings until the command register's
>>>> +         * Decode (I/O or memory) bit is not set. Two reasons:
>>>> +         * - PCI Spec indicates that while the bit is not set
>>>> +         *   the memory sizing is not finished.
>>>> +         * - pci_bar_address will return PCI_BAR_UNMAPPED
>>>> +         *   and a previous value can be accidentally overridden
>>>> +         *   if the command register is modified (and not the BAR).
>>>> +         * */
>>>> +        if (((r->type & PCI_BASE_ADDRESS_SPACE_IO) &&
>>>> +             !(cmd & PCI_COMMAND_IO)) ||
>>>> +            ((r->type != PCI_BASE_ADDRESS_SPACE_IO) &&
>>>> +             !(cmd & PCI_COMMAND_MEMORY))) {
>>>> +            continue;
>>>> +        }
>>>> +
>>>>            new_addr = pci_bar_address(d, i, r->type, r->size);
>>>>
>>>>            /* This bar isn't changed */
>>>>
>>>
>>>
>>
>

  reply	other threads:[~2016-01-11 18:01 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-01-11 12:24 [Qemu-devel] [PATCH] hw/pci: do not update the PCI mappings while Decode (I/O or memory) bit is not set in the Command register Marcel Apfelbaum
2016-01-11 14:07 ` Igor Mammedov
2016-01-11 15:10   ` Marcel Apfelbaum
2016-01-11 16:11 ` Laszlo Ersek
2016-01-11 16:34   ` Marcel Apfelbaum
2016-01-11 17:15     ` Laszlo Ersek
2016-01-11 18:01       ` Marcel Apfelbaum [this message]
2016-01-11 18:44         ` Laszlo Ersek
2016-01-11 18:57           ` Marcel Apfelbaum
2016-01-14 12:24             ` Marcel Apfelbaum
2016-01-14 14:30               ` Laszlo Ersek
2016-01-14 14:49                 ` Michael S. Tsirkin
2016-01-14 15:23                   ` Marcel Apfelbaum
2016-01-14 15:37                     ` Michael S. Tsirkin
2016-01-14 17:20                       ` Marcel Apfelbaum
2016-01-14 17:28                         ` Michael S. Tsirkin
2016-01-14 18:25                           ` Marcel Apfelbaum
2016-01-14 15:14                 ` Marcel Apfelbaum

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5693EE0B.7030002@redhat.com \
    --to=marcel@redhat.com \
    --cc=lersek@redhat.com \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).