From mboxrd@z Thu Jan 1 00:00:00 1970 From: Alexey Kardashevskiy Subject: Re: [RFC PATCH] qemu pci: pci_add_capability enhancement to prevent damaging config space Date: Fri, 18 May 2012 15:12:35 +1000 Message-ID: <4FB5DA43.90907@ozlabs.ru> References: <4FACB581.2050609@ozlabs.ru> <6A22E211-BC82-49BD-A335-02D3BAA14A17@suse.de> <4FAD0A4F.2050506@ozlabs.ru> <4FB080CE.3030703@ozlabs.ru> Mime-Version: 1.0 Content-Type: text/plain; charset=KOI8-R Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: qemu-devel@nongnu.org, Alex Williamson , Benjamin Herrenschmidt , David Gibson , anthony@codemonkey.ws, kvm@vger.kernel.org To: Alexander Graf Return-path: Received: from mail-pz0-f46.google.com ([209.85.210.46]:57705 "EHLO mail-pz0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757408Ab2ERFMq (ORCPT ); Fri, 18 May 2012 01:12:46 -0400 Received: by dady13 with SMTP id y13so3350096dad.19 for ; Thu, 17 May 2012 22:12:45 -0700 (PDT) In-Reply-To: <4FB080CE.3030703@ozlabs.ru> Sender: kvm-owner@vger.kernel.org List-ID: Alexander, Is that any better? :) @@ -1779,11 +1779,29 @@ static void pci_del_option_rom(PCIDevice *pdev) * in pci config space */ int pci_add_capability(PCIDevice *pdev, uint8_t cap_id, uint8_t offset, uint8_t size) { - uint8_t *config; + uint8_t *config, existing; int i, overlapping_cap; + existing =3D pci_find_capability(pdev, cap_id); + if (existing) { + if (offset && (existing !=3D offset)) { + return -EEXIST; + } + for (i =3D existing; i < size; ++i) { + if (pdev->used[i]) { + return -EFAULT; + } + } + memset(pdev->used + offset, 0xFF, size); + /* Make capability read-only by default */ + memset(pdev->wmask + offset, 0, size); + /* Check capability by default */ + memset(pdev->cmask + offset, 0xFF, size); + return existing; + } + if (!offset) { offset =3D pci_find_space(pdev, size); if (!offset) { return -ENOSPC; On 14/05/12 13:49, Alexey Kardashevskiy wrote: > On 12/05/12 00:13, Alexander Graf wrote: >> >> On 11.05.2012, at 14:47, Alexey Kardashevskiy wrote: >> >>> 11.05.2012 20:52, Alexander Graf =CE=C1=D0=C9=D3=C1=CC: >>>> >>>> On 11.05.2012, at 08:45, Alexey Kardashevskiy wrote: >>>> >>>>> Normally the pci_add_capability is called on devices to add new >>>>> capability. This is ok for emulated devices which capabilities li= st >>>>> is being built by QEMU. >>>>> >>>>> In the case of VFIO the capability may already exist and adding n= ew >>>>> capability into the beginning of the linked list may create a loo= p. >>>>> >>>>> For example, the old code destroys the following config >>>>> of PCIe Intel E1000E: >>>>> >>>>> before adding PCI_CAP_ID_MSI (0x05): >>>>> 0x34: 0xC8 >>>>> 0xC8: 0x01 0xD0 >>>>> 0xD0: 0x05 0xE0 >>>>> 0xE0: 0x10 0x00 >>>>> >>>>> after: >>>>> 0x34: 0xD0 >>>>> 0xC8: 0x01 0xD0 >>>>> 0xD0: 0x05 0xC8 >>>>> 0xE0: 0x10 0x00 >>>>> >>>>> As result capabilities 0x01 and 0x05 point to each other. >>>>> >>>>> The proposed patch does not change capability pointers when >>>>> the same type capability is about to add. >>>>> >>>>> Signed-off-by: Alexey Kardashevskiy >>>>> --- >>>>> hw/pci.c | 10 ++++++---- >>>>> 1 files changed, 6 insertions(+), 4 deletions(-) >>>>> >>>>> diff --git a/hw/pci.c b/hw/pci.c >>>>> index aa0c0b8..1f7c924 100644 >>>>> --- a/hw/pci.c >>>>> +++ b/hw/pci.c >>>>> @@ -1794,10 +1794,12 @@ int pci_add_capability(PCIDevice *pdev, u= int8_t cap_id, >>>>> } >>>>> >>>>> config =3D pdev->config + offset; >>>>> - config[PCI_CAP_LIST_ID] =3D cap_id; >>>>> - config[PCI_CAP_LIST_NEXT] =3D pdev->config[PCI_CAPABILITY_LI= ST]; >>>>> - pdev->config[PCI_CAPABILITY_LIST] =3D offset; >>>>> - pdev->config[PCI_STATUS] |=3D PCI_STATUS_CAP_LIST; >>>>> + if (config[PCI_CAP_LIST_ID] !=3D cap_id) { >>>> >>>> This doesn't scale. Capabilities are a list of CAPs. You'll have t= o do a loop through all capabilities, check if the one you want to add = is there already and if so either >>>> * replace the existing one or >>>> * drop out and not write the new one in. >> >> * hw_error :) >> >>>> >>>> I'm not sure which way would be more natural. >>> >>> There is a third option - add another function, lets call it >>> pci_fixup_capability() which would do whatever pci_add_capability()= does >>> but won't touch list pointers. >> >> What good is a function that breaks internal consistency? >=20 >=20 > It is broken already by having PCIDevice.used field. Normally pci_add= _capability() would go through > the whole list and add a capability if it does not exist. Emulated de= vices which care about having a > capability at some fixed offset would have initialized their config s= pace before calling this > capabilities API (as VFIO does). >=20 > If we really want to support emulated devices which want some capabil= ities be at fixed offset and > others at random offsets (strange, but ok), I do not see how it is ba= d to restore this consistency > by special function (pci_fixup_capability()) to avoid its rewriting a= t different location as a guest > driver may care about its offset. >=20 >=20 >=20 >>> When vfio, pci_add_capability() is called from the code which knows >>> exactly that the capability exists and where it is and it calls >>> pci_add_capability() based on this knowledge so doing additional lo= ops >>> just for imaginery scalability is a bit weird, no? >> >> Not sure I understand your proposal. The more generic a framework is= , the better, no? In this code path we don't care about speed. We only = care about consistency and reliability. >> >> >> Alex >> >=20 >=20 --=20 Alexey