From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: kvm@vger.kernel.org, Alexander Graf <agraf@suse.de>,
qemu-devel@nongnu.org,
Alex Williamson <alex.williamson@redhat.com>,
anthony@codemonkey.ws, David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [Qemu-devel] [RFC PATCH] qemu pci: pci_add_capability enhancement to prevent damaging config space
Date: Tue, 22 May 2012 12:02:50 +1000 [thread overview]
Message-ID: <1337652170.2779.143.camel@pasglop> (raw)
In-Reply-To: <4FB5DA43.90907@ozlabs.ru>
On Fri, 2012-05-18 at 15:12 +1000, Alexey Kardashevskiy wrote:
> Alexander,
>
> Is that any better? :)
Alex (Graf that is), ping ?
The original patch from Alexey was fine btw.
VFIO will always call things with the existing capability offset so
there's no real risk of doing the wrong thing or break the list or
anything.
IE. A small simple patch that addresses the problem :-)
The new patch is a bit more "robust" I believe, I don't think we need to
go too far to fix a problem we don't have. But we need a fix for the
real issue and the simple patch does it neatly from what I can
understand.
Cheers,
Ben.
>
> @@ -1779,11 +1779,29 @@ static void pci_del_option_rom(PCIDevice *pdev)
> * in pci config space */
> int pci_add_capability(PCIDevice *pdev, uint8_t cap_id,
> uint8_t offset, uint8_t size)
> {
> - uint8_t *config;
> + uint8_t *config, existing;
> int i, overlapping_cap;
>
> + existing = pci_find_capability(pdev, cap_id);
> + if (existing) {
> + if (offset && (existing != offset)) {
> + return -EEXIST;
> + }
> + for (i = existing; i < size; ++i) {
> + if (pdev->used[i]) {
> + return -EFAULT;
> + }
> + }
> + memset(pdev->used + offset, 0xFF, size);
> + /* Make capability read-only by default */
> + memset(pdev->wmask + offset, 0, size);
> + /* Check capability by default */
> + memset(pdev->cmask + offset, 0xFF, size);
> + return existing;
> + }
> +
> if (!offset) {
> offset = pci_find_space(pdev, size);
> if (!offset) {
> return -ENOSPC;
>
>
>
>
>
>
> On 14/05/12 13:49, Alexey Kardashevskiy wrote:
> > On 12/05/12 00:13, Alexander Graf wrote:
> >>
> >> On 11.05.2012, at 14:47, Alexey Kardashevskiy wrote:
> >>
> >>> 11.05.2012 20:52, Alexander Graf написал:
> >>>>
> >>>> On 11.05.2012, at 08:45, Alexey Kardashevskiy wrote:
> >>>>
> >>>>> Normally the pci_add_capability is called on devices to add new
> >>>>> capability. This is ok for emulated devices which capabilities list
> >>>>> is being built by QEMU.
> >>>>>
> >>>>> In the case of VFIO the capability may already exist and adding new
> >>>>> capability into the beginning of the linked list may create a loop.
> >>>>>
> >>>>> For example, the old code destroys the following config
> >>>>> of PCIe Intel E1000E:
> >>>>>
> >>>>> before adding PCI_CAP_ID_MSI (0x05):
> >>>>> 0x34: 0xC8
> >>>>> 0xC8: 0x01 0xD0
> >>>>> 0xD0: 0x05 0xE0
> >>>>> 0xE0: 0x10 0x00
> >>>>>
> >>>>> after:
> >>>>> 0x34: 0xD0
> >>>>> 0xC8: 0x01 0xD0
> >>>>> 0xD0: 0x05 0xC8
> >>>>> 0xE0: 0x10 0x00
> >>>>>
> >>>>> As result capabilities 0x01 and 0x05 point to each other.
> >>>>>
> >>>>> The proposed patch does not change capability pointers when
> >>>>> the same type capability is about to add.
> >>>>>
> >>>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> >>>>> ---
> >>>>> hw/pci.c | 10 ++++++----
> >>>>> 1 files changed, 6 insertions(+), 4 deletions(-)
> >>>>>
> >>>>> diff --git a/hw/pci.c b/hw/pci.c
> >>>>> index aa0c0b8..1f7c924 100644
> >>>>> --- a/hw/pci.c
> >>>>> +++ b/hw/pci.c
> >>>>> @@ -1794,10 +1794,12 @@ int pci_add_capability(PCIDevice *pdev, uint8_t cap_id,
> >>>>> }
> >>>>>
> >>>>> config = pdev->config + offset;
> >>>>> - config[PCI_CAP_LIST_ID] = cap_id;
> >>>>> - config[PCI_CAP_LIST_NEXT] = pdev->config[PCI_CAPABILITY_LIST];
> >>>>> - pdev->config[PCI_CAPABILITY_LIST] = offset;
> >>>>> - pdev->config[PCI_STATUS] |= PCI_STATUS_CAP_LIST;
> >>>>> + if (config[PCI_CAP_LIST_ID] != cap_id) {
> >>>>
> >>>> This doesn't scale. Capabilities are a list of CAPs. You'll have to do a loop through all capabilities, check if the one you want to add is there already and if so either
> >>>> * replace the existing one or
> >>>> * drop out and not write the new one in.
> >>
> >> * hw_error :)
> >>
> >>>>
> >>>> I'm not sure which way would be more natural.
> >>>
> >>> There is a third option - add another function, lets call it
> >>> pci_fixup_capability() which would do whatever pci_add_capability() does
> >>> but won't touch list pointers.
> >>
> >> What good is a function that breaks internal consistency?
> >
> >
> > It is broken already by having PCIDevice.used field. Normally pci_add_capability() would go through
> > the whole list and add a capability if it does not exist. Emulated devices which care about having a
> > capability at some fixed offset would have initialized their config space before calling this
> > capabilities API (as VFIO does).
> >
> > If we really want to support emulated devices which want some capabilities be at fixed offset and
> > others at random offsets (strange, but ok), I do not see how it is bad to restore this consistency
> > by special function (pci_fixup_capability()) to avoid its rewriting at different location as a guest
> > driver may care about its offset.
> >
> >
> >
> >>> When vfio, pci_add_capability() is called from the code which knows
> >>> exactly that the capability exists and where it is and it calls
> >>> pci_add_capability() based on this knowledge so doing additional loops
> >>> just for imaginery scalability is a bit weird, no?
> >>
> >> Not sure I understand your proposal. The more generic a framework is, the better, no? In this code path we don't care about speed. We only care about consistency and reliability.
> >>
> >>
> >> Alex
> >>
> >
> >
>
>
next prev parent reply other threads:[~2012-05-22 2:03 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-05-11 6:45 [Qemu-devel] [RFC PATCH] qemu pci: pci_add_capability enhancement to prevent damaging config space Alexey Kardashevskiy
2012-05-11 10:52 ` Alexander Graf
2012-05-11 12:47 ` Alexey Kardashevskiy
2012-05-11 14:13 ` Alexander Graf
2012-05-14 3:49 ` Alexey Kardashevskiy
2012-05-18 5:12 ` Alexey Kardashevskiy
2012-05-22 2:02 ` Benjamin Herrenschmidt [this message]
2012-05-22 3:21 ` Alexander Graf
2012-05-22 3:44 ` Alexey Kardashevskiy
2012-05-22 5:52 ` Alexander Graf
2012-05-22 6:11 ` Alexey Kardashevskiy
2012-05-22 6:31 ` Alexander Graf
2012-05-22 7:01 ` Alexey Kardashevskiy
2012-05-22 7:13 ` Alexander Graf
2012-05-22 7:37 ` Benjamin Herrenschmidt
2012-06-08 8:47 ` Alexey Kardashevskiy
2012-06-08 10:56 ` Jan Kiszka
2012-06-08 11:16 ` Alexey Kardashevskiy
2012-06-08 11:30 ` Jan Kiszka
2012-06-08 14:00 ` Alexey Kardashevskiy
2012-06-08 14:43 ` Jan Kiszka
2012-06-08 14:56 ` Alex Williamson
2012-06-08 15:05 ` Jan Kiszka
2012-06-08 15:22 ` Alex Williamson
2012-05-22 6:38 ` Alexander Graf
2012-05-11 19:20 ` Jason Baron
2012-05-12 0:27 ` Alexey Kardashevskiy
2012-05-14 2:37 ` Alex Williamson
-- strict thread matches above, loose matches on Subject: below --
2012-05-11 6:59 Alexey Kardashevskiy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1337652170.2779.143.camel@pasglop \
--to=benh@kernel.crashing.org \
--cc=agraf@suse.de \
--cc=aik@ozlabs.ru \
--cc=alex.williamson@redhat.com \
--cc=anthony@codemonkey.ws \
--cc=david@gibson.dropbear.id.au \
--cc=kvm@vger.kernel.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).