From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:46093) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XIZco-000154-FG for qemu-devel@nongnu.org; Sat, 16 Aug 2014 04:46:15 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1XIZcj-0003W1-5f for qemu-devel@nongnu.org; Sat, 16 Aug 2014 04:46:10 -0400
Received: from mout.web.de ([212.227.15.14]:50731) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XIZci-0003Vw-Rz for qemu-devel@nongnu.org; Sat, 16 Aug 2014 04:46:05 -0400
Message-ID: <53EF1A2D.4050709@web.de>
Date: Sat, 16 Aug 2014 10:45:33 +0200
From: Jan Kiszka
MIME-Version: 1.0
References: <1407740702-4086-1-git-send-email-tamlokveer@gmail.com> <20140814111511.GS31346@redhat.com> <53ECA71F.6070700@web.de> <1408077757.11008.89.camel@ori.omang.mine.nu> <1408101325.14053.41.camel@abi.no.oracle.com> <1408175661.11008.113.camel@ori.omang.mine.nu>
In-Reply-To: <1408175661.11008.113.camel@ori.omang.mine.nu>
Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="winkEs199fvdNUvT0O6IdgJxJKDEiKwwa"
Subject: Re: [Qemu-devel] [PATCH v3 0/5] intel-iommu: introduce Intel IOMMU (VT-d) emulation to q35 chipset
To: Knut Omang, Le Tan
Cc: "Michael S. Tsirkin", Stefan Weil, qemu-devel, Alex Williamson, Anthony Liguori, Paolo Bonzini

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--winkEs199fvdNUvT0O6IdgJxJKDEiKwwa
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

On 2014-08-16 09:54, Knut Omang wrote:
> On Fri, 2014-08-15 at 19:37 +0800, Le Tan wrote:
>> Hi Knut,
>>
>> 2014-08-15 19:15 GMT+08:00 Knut Omang :
>>> On Fri, 2014-08-15 at 06:42 +0200, Knut Omang wrote:
>>>> On Thu, 2014-08-14 at 14:10 +0200, Jan Kiszka wrote:
>>>>> On 2014-08-14 13:15, Michael S.
Tsirkin wrote:
>>>>>> On Mon, Aug 11, 2014 at 03:04:57PM +0800, Le Tan wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> These patches are intended to introduce Intel IOMMU (VT-d) emulation to q35
>>>>>>> chipset. The major job in these patches is to add support for emulating Intel
>>>>>>> IOMMU according to the VT-d specification, including basic responses to CSRs
>>>>>>> accesses, the logics of DMAR (DMA remapping) and DMA memory address
>>>>>>> translations.
>>>>>>
>>>>>> Thanks!
>>>>>> Looks very good overall, I noted some coding style issues - I didn't
>>>>>> bother reporting each issue in every place where it appears - reported
>>>>>> each issue once only, so please find and fix all instances of each
>>>>>> issue.
>>>>>
>>>>> BTW, because I was in urgent need for virtual test environment for
>>>>> Jailhouse, I hacked interrupt remapping on top of Le's patches:
>>>>>
>>>>> http://git.kiszka.org/?p=qemu.git;a=shortlog;h=refs/heads/queues/vtd-intremap
>>>>>
>>>>> The approach likely needs further discussions and refinements but it
>>>>> already allows me to work on top with our hypervisor, and also Linux.
>>>>> You can see from the last commit that Le's work made it pretty easy to
>>>>> build this on top.
>>>>
>>>> Le,
>>>>
>>>> I have tried Jan's branch with my device setup which consists of a
>>>> minimal q35 setup, an ioh3420 root port (specified as -device
>>>> ioh3420,slot=0 ) and a pcie device plugged into that root port, which
>>>> gives the following lspci -t:
>>>>
>>>> -[0000:00]-+-00.0
>>>>            +-01.0
>>>>            +-02.0
>>>>            +-03.0-[01]----00.0
>>>>            +-04.0
>>>>            +-1f.0
>>>>            +-1f.2
>>>>            \-1f.3
>>>>
>>>> All seems to work beautifully (I see the ISA bridge happily receive
>>>> translations) until the first DMA from my device model (at 1:00.0)
>>>> arrives, at which point I get:
>>>>
>>>> [ 1663.732413] dmar: DMAR:[DMA Write] Request device [00:03.0] fault addr fffa0000
>>>> [ 1663.732413] DMAR:[fault reason 02] Present bit in context entry is clear
>>>>
>>>> I would have expected request device 01:00.0 for this.
>>>> It is not clear to me yet if this is a weakness of the implementation of
>>>> ioh3420 or the iommu. Just wanted to let you know right away in case you
>>>> can shed some light to it or it is an easy fix,
>>>>
>>>> The device uses pci_dma_rw with itself as device pointer.
>>>
>>> To verify my hypothesis: with this rude hack my device now works much
>>> better:
>>>
>>> @@ -774,6 +780,8 @@ static void iommu_translate(VTDAddressSpace *vtd_as,
>>> int bus_num, int devfn,
>>>          is_fpd_set = ce.lo & VTD_CONTEXT_ENTRY_FPD;
>>>      } else {
>>>          ret_fr = dev_to_context_entry(s, bus_num, devfn, &ce);
>>> +        if (ret_fr)
>>> +            ret_fr = dev_to_context_entry(s, 1, 0, &ce);
>>>          is_fpd_set = ce.lo & VTD_CONTEXT_ENTRY_FPD;
>>>          if (ret_fr) {
>>>              ret_fr = -ret_fr;
>>>
>>> Looking at how things look on hardware, multiple devices often receive
>>> overlapping DMA address ranges for different physical addresses.
>>>
>>> So if I understand the way this works, every requester ID would also
>>> need to have its own unique VTDAddressSpace, as each pci
>>> device/function sees a unique DMA address space..
>>
>> ioh3420 is a pcie-to-pcie bridge, right?
>
> Yes.
>
>> In my opinion, each pci-e
>> device behind the pcie-to-pcie bridge can be assigned individually.
>> For now I added the VT-d to q35 by just adding it to the root pci bus.
>> You can see here in q35.c:
>> pci_setup_iommu(pci_bus, q35_host_dma_iommu, mch->iommu);
>> So if we add a pcie-to-pcie bridge, we may have to call the
>> pci_setup_iommu() for that new bus. I don't know where to hook into
>> this now. :) If you know the mechanism behind that, you can try to add
>> that for the new bus. (I will dive into this after the clean up.)
>> What do you think?
>
> Thanks for the quick answer, that helped a lot!
>
> Looking into the details here I realize it is slightly more complicated:
> secondary buses are enumerated after device instantiation, as part of
> the host PCI enumeration, so if I add a similar setup call in the bridge
> setup, it will be called for a new device long before it has received
> its bus number from the OS (via config[PCI_SECONDARY_BUS] )
>
> I agree that the lookup function for contexts needs to be as efficient
> as possible so the simple lookup key may be the best
> solution but then the address_spaces table cannot be populated with the
> secondary bus entries before it receives a nonzero != 255 bus number,
> eg. along the lines of this:
>
> diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
> index 4becdc1..d9a8c23 100644
> --- a/hw/pci/pci_bridge.c
> +++ b/hw/pci/pci_bridge.c
> @@ -265,6 +265,12 @@ void pci_bridge_write_config(PCIDevice *d,
>          pci_bridge_update_mappings(s);
>      }
>
> +    if (ranges_overlap(address, len, PCI_SECONDARY_BUS, 1)) {
> +        int bus_num = pci_bus_num(&s->sec_bus);
> +        if (bus_num != 0xff && bus_num != 0x00)
> +
> +    }
> +
>      newctl = pci_get_word(d->config + PCI_BRIDGE_CONTROL);
>      if (~oldctl & newctl & PCI_BRIDGE_CTL_BUS_RESET) {
>          /* Trigger hot reset on 0->1 transition. */
>
> but it is getting complicated...
> Thoughts?
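
To make the timing problem concrete, here is a minimal standalone C sketch.
It is not QEMU code: DemoBus, DemoAddressSpace, demo_requester_id() and
demo_late_enumeration() are hypothetical stand-ins invented for illustration.
It assumes the usual PCI requester ID layout (bus number in bits 15:8, devfn
in bits 7:0) and shows that an ID computed at device instantiation goes stale
once the guest programs the secondary bus number, while an ID resolved
through a reference to the bus at lookup time stays correct.

```c
#include <stdint.h>

/* Hypothetical stand-in for a bridge's secondary bus (NOT a QEMU type). */
typedef struct DemoBus {
    uint8_t secondary_bus;  /* 0 until the guest writes PCI_SECONDARY_BUS */
} DemoBus;

/* Hypothetical per-device address space: refers to the bus, stores no bus number. */
typedef struct DemoAddressSpace {
    const DemoBus *bus;     /* resolved on every lookup */
    uint8_t devfn;          /* (device << 3) | function */
} DemoAddressSpace;

/* Requester ID: bus number in bits 15:8, devfn in bits 7:0. */
uint16_t demo_requester_id(const DemoAddressSpace *as)
{
    return (uint16_t)((as->bus->secondary_bus << 8) | as->devfn);
}

/*
 * Walk through the sequence from the thread: the device is created before
 * the guest enumerates the bus, then the guest assigns bus number 1.
 * Returns 1 if the early snapshot is stale and the live lookup is correct.
 */
int demo_late_enumeration(void)
{
    DemoBus sec = { .secondary_bus = 0 };               /* not yet enumerated */
    DemoAddressSpace as = { .bus = &sec, .devfn = 0 };  /* device 00.0 behind the bridge */

    uint16_t snapshot = demo_requester_id(&as);         /* taken at instantiation: 00:00.0 */

    sec.secondary_bus = 1;                              /* guest programs the bridge */

    return snapshot == 0x0000 && demo_requester_id(&as) == 0x0100;
}
```

In this sketch nothing has to be repopulated when the guest renumbers the
bus; the cost is one extra pointer dereference per lookup.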

Point to the PCI bus from VTDAddressSpace instead of storing the bus_num
there?

Jan


--winkEs199fvdNUvT0O6IdgJxJKDEiKwwa
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)

iEYEARECAAYFAlPvGjAACgkQitSsb3rl5xQU5QCgvG+rM8oAF19zbhNNKJ/LjCFL
6+4AoNz/Ri5EdQ7oiCKzi9b7+yimdqPY
=wfZX
-----END PGP SIGNATURE-----

--winkEs199fvdNUvT0O6IdgJxJKDEiKwwa--