qemu-devel.nongnu.org archive mirror
From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: Alexander Graf <agraf@suse.de>
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	Alex Williamson <alex.williamson@redhat.com>,
	"anthony@codemonkey.ws" <anthony@codemonkey.ws>,
	David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [Qemu-devel] [RFC PATCH] qemu pci: pci_add_capability enhancement to prevent damaging config space
Date: Tue, 22 May 2012 13:44:21 +1000
Message-ID: <4FBB0B95.8050901@ozlabs.ru>
In-Reply-To: <6C472F5B-B8C3-48DE-B19B-00973AF6AC56@suse.de>

On 22/05/12 13:21, Alexander Graf wrote:
> 
> 
> On 22.05.2012, at 04:02, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> 
>> On Fri, 2012-05-18 at 15:12 +1000, Alexey Kardashevskiy wrote:
>>> Alexander,
>>>
>>> Is that any better? :)
>>
>> Alex (Graf that is), ping ?
>>
>> The original patch from Alexey was fine btw.
>>
>> VFIO will always call things with the existing capability offset so
>> there's no real risk of doing the wrong thing or breaking the list or
>> anything.
>>
>> IE. A small simple patch that addresses the problem :-)
>>
>> The new patch is a bit more "robust" I believe, I don't think we need to
>> go too far to fix a problem we don't have. But we need a fix for the
>> real issue and the simple patch does it neatly from what I can
>> understand.
>>
>> Cheers,
>> Ben.
>>
>>>
>>> @@ -1779,11 +1779,29 @@ static void pci_del_option_rom(PCIDevice *pdev)
>>>  * in pci config space */
>>> int pci_add_capability(PCIDevice *pdev, uint8_t cap_id,
>>>                        uint8_t offset, uint8_t size)
>>> {
>>> -    uint8_t *config;
>>> +    uint8_t *config, existing;
> 
> Existing is a pointer to the target dev's config space, right?

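Yes, or more precisely, it is the offset of the existing capability within the target device's
config space, as returned by pci_find_capability().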

>>>     int i, overlapping_cap;
>>>
>>> +    existing = pci_find_capability(pdev, cap_id);
>>> +    if (existing) {
>>> +        if (offset && (existing != offset)) {
>>> +            return -EEXIST;
>>> +        }
>>> +        for (i = existing; i < size; ++i) {
> 
> So how does this possibly make sense?

Although I do not expect VFIO to add capabilities (that would not make sense), I still want to
double-check that nobody else has already tried to use this space.
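For reference, a minimal standalone sketch of the range check described above. Note that the
loop as quoted runs "for (i = existing; i < size; ++i)", so when existing is at or beyond size
(an MSI capability at 0xD0, for instance) the body never executes. The sketch assumes the
intended range is [start, start + size); range_in_use() and CFG_SIZE are hypothetical names,
not part of QEMU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CFG_SIZE 256 /* assumption: conventional 256-byte PCI config space */

/*
 * Hypothetical helper, not QEMU code: returns 1 if any byte in
 * [start, start + size) is already marked in the used[] map, else 0.
 */
static int range_in_use(const uint8_t *used, unsigned start, unsigned size)
{
    unsigned i;

    assert(used != NULL);
    assert(start + size <= CFG_SIZE);

    /* The upper bound is start + size, not size alone. */
    for (i = start; i < start + size; ++i) {
        if (used[i]) {
            return 1;
        }
    }
    return 0;
}
```

With an all-zero used[] map, range_in_use(used, 0xD0, 0x0E) reports the range as free; marking
any byte inside 0xD0..0xDD makes it report the range as taken.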

>>> +            if (pdev->used[i]) {
>>> +                return -EFAULT;
>>> +            }
>>> +        }
>>> +        memset(pdev->used + offset, 0xFF, size);
> Why?

Because I am marking the space this capability takes as used.

>>> +        /* Make capability read-only by default */
>>> +        memset(pdev->wmask + offset, 0, size);
> Why?

Because pci_add_capability() does this for a new capability by default.


>>> +        /* Check capability by default */
>>> +        memset(pdev->cmask + offset, 0xFF, size);
> 
> I don't understand this part either.

pci_add_capability() does this for a new capability by default as well.
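To make the three memset() calls easier to follow, here is a small standalone model of what
they do to the per-byte maps. It is only a sketch: struct cfg_model and claim_capability() are
hypothetical names, and the three arrays merely stand in for the used, wmask and cmask fields
the patch touches. As far as I understand them, used records which bytes are allocated, wmask
which bits the guest may write, and cmask which bits are checked on load.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CFG_SIZE 256 /* assumption: conventional 256-byte PCI config space */

/* Hypothetical stand-in for the three per-byte maps touched by the patch. */
struct cfg_model {
    uint8_t used[CFG_SIZE];  /* non-zero: byte belongs to a capability */
    uint8_t wmask[CFG_SIZE]; /* set bits are writable by the guest */
    uint8_t cmask[CFG_SIZE]; /* set bits are checked on load */
};

/* Mark [offset, offset + size) as used, read-only and checked. */
static int claim_capability(struct cfg_model *m, unsigned offset, unsigned size)
{
    assert(m != NULL);
    if (offset + size > CFG_SIZE) {
        return -1; /* range would run past the end of config space */
    }
    memset(m->used + offset, 0xFF, size);  /* space is taken */
    memset(m->wmask + offset, 0, size);    /* read-only by default */
    memset(m->cmask + offset, 0xFF, size); /* checked by default */
    return 0;
}
```

One thing the sketch sidesteps: in the quoted patch the memset() calls are based on offset,
which may be 0 when the caller asks for automatic placement. In that case existing, rather
than offset, looks like the intended base.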



> 
> Alex
> 
>>> +        return existing;
>>> +    }
>>> +
>>>     if (!offset) {
>>>         offset = pci_find_space(pdev, size);
>>>         if (!offset) {
>>>             return -ENOSPC;
>>>
>>>
>>>
>>>
>>>
>>>
>>> On 14/05/12 13:49, Alexey Kardashevskiy wrote:
>>>> On 12/05/12 00:13, Alexander Graf wrote:
>>>>>
>>>>> On 11.05.2012, at 14:47, Alexey Kardashevskiy wrote:
>>>>>
>>>>>> On 11.05.2012 20:52, Alexander Graf wrote:
>>>>>>>
>>>>>>> On 11.05.2012, at 08:45, Alexey Kardashevskiy wrote:
>>>>>>>
>>>>>>>> Normally pci_add_capability() is called on a device to add a new
>>>>>>>> capability. This is OK for emulated devices, whose capability list
>>>>>>>> is built by QEMU.
>>>>>>>>
>>>>>>>> In the case of VFIO the capability may already exist, and adding a new
>>>>>>>> capability at the beginning of the linked list may create a loop.
>>>>>>>>
>>>>>>>> For example, the old code destroys the following config
>>>>>>>> of PCIe Intel E1000E:
>>>>>>>>
>>>>>>>> before adding PCI_CAP_ID_MSI (0x05):
>>>>>>>> 0x34: 0xC8
>>>>>>>> 0xC8: 0x01 0xD0
>>>>>>>> 0xD0: 0x05 0xE0
>>>>>>>> 0xE0: 0x10 0x00
>>>>>>>>
>>>>>>>> after:
>>>>>>>> 0x34: 0xD0
>>>>>>>> 0xC8: 0x01 0xD0
>>>>>>>> 0xD0: 0x05 0xC8
>>>>>>>> 0xE0: 0x10 0x00
>>>>>>>>
>>>>>>>> As a result, capabilities 0x01 and 0x05 point to each other.
>>>>>>>>
>>>>>>>> The proposed patch does not change the capability pointers when
>>>>>>>> a capability of the same type is about to be added.
>>>>>>>>
>>>>>>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>>>>>>>> ---
>>>>>>>> hw/pci.c |   10 ++++++----
>>>>>>>> 1 files changed, 6 insertions(+), 4 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/hw/pci.c b/hw/pci.c
>>>>>>>> index aa0c0b8..1f7c924 100644
>>>>>>>> --- a/hw/pci.c
>>>>>>>> +++ b/hw/pci.c
>>>>>>>> @@ -1794,10 +1794,12 @@ int pci_add_capability(PCIDevice *pdev, uint8_t cap_id,
>>>>>>>>   }
>>>>>>>>
>>>>>>>>   config = pdev->config + offset;
>>>>>>>> -    config[PCI_CAP_LIST_ID] = cap_id;
>>>>>>>> -    config[PCI_CAP_LIST_NEXT] = pdev->config[PCI_CAPABILITY_LIST];
>>>>>>>> -    pdev->config[PCI_CAPABILITY_LIST] = offset;
>>>>>>>> -    pdev->config[PCI_STATUS] |= PCI_STATUS_CAP_LIST;
>>>>>>>> +    if (config[PCI_CAP_LIST_ID] != cap_id) {
>>>>>>>
>>>>>>> This doesn't scale. Capabilities are a list of CAPs. You'll have to do a loop through all capabilities, check if the one you want to add is there already and if so either
>>>>>>> * replace the existing one or
>>>>>>> * drop out and not write the new one in.
>>>>>
>>>>>  * hw_error :)
>>>>>
>>>>>>>
>>>>>>> I'm not sure which way would be more natural.
>>>>>>
>>>>>> There is a third option: add another function, let's call it
>>>>>> pci_fixup_capability(), which would do whatever pci_add_capability() does
>>>>>> but would not touch the list pointers.
>>>>>
>>>>> What good is a function that breaks internal consistency?
>>>>
>>>>
>>>> It is already broken by the PCIDevice.used field. Normally pci_add_capability() would go through
>>>> the whole list and add a capability only if it does not exist yet. Emulated devices that care about
>>>> having a capability at some fixed offset would have initialized their config space before calling
>>>> this capabilities API (as VFIO does).
>>>>
>>>> If we really want to support emulated devices that want some capabilities at fixed offsets and
>>>> others at arbitrary offsets (strange, but OK), I do not see what is wrong with restoring this
>>>> consistency through a special function (pci_fixup_capability()), so that a capability is not
>>>> rewritten at a different location; a guest driver may care about its offset.
>>>>
>>>>
>>>>
>>>>>> With VFIO, pci_add_capability() is called from code that knows
>>>>>> exactly that the capability exists and where it is, and it calls
>>>>>> pci_add_capability() based on this knowledge, so doing additional loops
>>>>>> just for imaginary scalability is a bit weird, no?
>>>>>
>>>>> Not sure I understand your proposal. The more generic a framework is, the better, no? In this code path we don't care about speed. We only care about consistency and reliability.
>>>>>
>>>>>
>>>>> Alex
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>

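For anyone who wants to reproduce the loop from the original patch description quoted above,
here is a standalone sketch on a plain 256-byte array. It fills in the E1000E capability list
from the dump, applies the old "prepend to the list" logic, and contrasts it with a guarded
variant that first walks the list. All function names here are hypothetical and this is not
the QEMU code; only the offsets and IDs come from the dump.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CFG_SIZE            256
#define PCI_CAPABILITY_LIST 0x34 /* head pointer of the capability list */
#define PCI_CAP_LIST_ID     0    /* capability ID byte */
#define PCI_CAP_LIST_NEXT   1    /* next-pointer byte */

/* Capability list from the dump: 0x34 -> 0xC8 -> 0xD0 -> 0xE0 -> end. */
static void fill_e1000e_caps(uint8_t *config)
{
    memset(config, 0, CFG_SIZE);
    config[PCI_CAPABILITY_LIST] = 0xC8;
    config[0xC8] = 0x01; config[0xC9] = 0xD0;
    config[0xD0] = 0x05; config[0xD1] = 0xE0;
    config[0xE0] = 0x10; config[0xE1] = 0x00;
}

/* Returns the offset of cap_id, 0 if absent, -1 if the list loops. */
static int find_capability(const uint8_t *config, uint8_t cap_id)
{
    uint8_t pos = config[PCI_CAPABILITY_LIST] & 0xFC;
    int hops;

    /* Entries are 4-byte aligned, so more than CFG_SIZE / 4 hops is a loop. */
    for (hops = 0; pos != 0; ++hops) {
        if (hops > CFG_SIZE / 4) {
            return -1;
        }
        if (config[pos + PCI_CAP_LIST_ID] == cap_id) {
            return pos;
        }
        pos = config[pos + PCI_CAP_LIST_NEXT] & 0xFC;
    }
    return 0;
}

/* Old behaviour: unconditionally prepend the capability to the list. */
static void add_capability_old(uint8_t *config, uint8_t cap_id, uint8_t offset)
{
    assert(offset >= 0x40 && offset < CFG_SIZE - 1);
    config[offset + PCI_CAP_LIST_ID] = cap_id;
    config[offset + PCI_CAP_LIST_NEXT] = config[PCI_CAPABILITY_LIST];
    config[PCI_CAPABILITY_LIST] = offset;
}

/* Guarded variant: leave the pointers alone if cap_id is already there. */
static int add_capability_guarded(uint8_t *config, uint8_t cap_id,
                                  uint8_t offset)
{
    int existing = find_capability(config, cap_id);

    if (existing < 0) {
        return -1; /* list is already corrupt */
    }
    if (existing > 0) {
        return existing == offset ? existing : -1; /* same spot or refuse */
    }
    add_capability_old(config, cap_id, offset);
    return offset;
}
```

Calling add_capability_old(cfg, 0x05, 0xD0) on the filled-in array reproduces the "after"
state from the dump: 0x34 points at 0xD0, 0xD0 points back at 0xC8, and 0xE0 is no longer
reachable. The guarded variant returns 0xD0 and leaves the list untouched.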

-- 
Alexey
