From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: Oliver O'Halloran <oohall@gmail.com>
Cc: KVM list <kvm@vger.kernel.org>,
Fabiano Rosas <farosas@linux.ibm.com>,
Alistair Popple <alistair@popple.id.au>,
kvm-ppc@vger.kernel.org,
linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [PATCH kernel v2 0/7] powerpc/powenv/ioda: Allow huge DMA window at 4GB
Date: Wed, 22 Apr 2020 16:49:06 +1000 [thread overview]
Message-ID: <ee3fd87f-f2b6-1439-a310-fedc614e6155@ozlabs.ru> (raw)
In-Reply-To: <CAOSf1CG_qiR2HvSFVTbgTyqVmDt4+Oy60PNWY23K2ihHib1K7Q@mail.gmail.com>
On 21/04/2020 16:35, Oliver O'Halloran wrote:
> On Tue, Apr 21, 2020 at 3:11 PM Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>>
>> One example of a problem device is AMD GPU with 64bit video PCI function
>> and 32bit audio, no?
>>
>> What PEs will they get assigned to now? Where will audio's MMIO go? It
>> cannot be the same 64bit MMIO segment, right? If so, it is a separate PE
>> already. If not, then I do not understand "we're free to assign whatever
>> PE number we want.
>
> The BARs stay in the same place and as far as MMIO is concerned
> nothing has changed. For MMIO the PHB uses the MMIO address to find a
> PE via the M64 BAR table, but for DMA it uses a *completely* different
> mechanism. Instead it takes the BDFN (included in the DMA packet
> header) and the Requester Translation Table (RTT) to map the BDFN to a
> PE. Normally you would configure the PHB so the same PE used for MMIO
> and DMA, but you don't have to.
32bit MMIO is what puzzles me in this picture, how does it work?
>>> I think the key thing to realise is that we'd only be using the DMA-PE
>>> when a crippled DMA mask is set by the driver. In all other cases we
>>> can just use the "native PE" and when the driver unbinds we can de-
>>> allocate our DMA-PE and return the device to the PE containing it's
>>> MMIO BARs. I think we can keep things relatively sane that way and the
>>> real issue is detecting EEH events on the DMA-PE.
>>
>>
>> Oooor we could just have 1 DMA window (or, more precisely, a single
>> "TVE" as it is either window or bypass) per a PE and give every function
>> its own PE and create a window or a table when a device sets a DMA mask.
>> I feel I am missing something here though.
>
> Yes, we could do that, but do we want to?
>
> I was thinking we should try minimise the number of DMA-only PEs since
> it complicates the EEH freeze handling. When MMIO and DMA are mapped
> to the same PE an error on either will cause the hardware to stop
> both. When seperate PEs are used for DMA and MMIO you lose that
> atomicity. It's not a big deal if DMA is stopped and MMIO allowed
> since PAPR (sort-of) allows that, but having MMIO frozen with DMA
> unfrozen is a bit sketch.
You suggested using slave PEs for crippled functions - won't we have the
same problem then?
And is this "slave PE" something the hardware supports or it is a
software concept?
>>>> For the time being, this patchset is good for:
>>>> 1. weird hardware which has limited DMA mask (this is why the patchset
>>>> was written in the first place)
>>>> 2. debug DMA by routing it via IOMMU (even when 4GB hack is not enabled).
>>>
>>> Sure, but it's still dependent on having firmware which supports the
>>> 4GB hack and I don't think that's in any offical firmware releases yet.
>>
>> It's been a while :-/
>
> There's been no official FW releases with a skiboot that supports the
> phb get/set option opal calls so the only systems that can actually
> take advantage of it are our lab systems. It might still be useful for
> future systems, but I'd rather something that doesn't depend on FW
> support.
Pensando folks use it ;)
--
Alexey
next prev parent reply other threads:[~2020-04-22 6:51 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-03-23 7:53 [PATCH kernel v2 0/7] powerpc/powenv/ioda: Allow huge DMA window at 4GB Alexey Kardashevskiy
2020-03-23 7:53 ` [PATCH kernel v2 1/7] powerpc/powernv/ioda: Move TCE bypass base to PE Alexey Kardashevskiy
2020-03-23 7:53 ` [PATCH kernel v2 2/7] powerpc/powernv/ioda: Rework for huge DMA window at 4GB Alexey Kardashevskiy
2020-03-23 7:53 ` [PATCH kernel v2 3/7] powerpc/powernv/ioda: Allow smaller TCE table levels Alexey Kardashevskiy
2020-03-23 7:53 ` [PATCH kernel v2 4/7] powerpc/powernv/phb4: Use IOMMU instead of bypassing Alexey Kardashevskiy
2020-03-23 7:53 ` [PATCH kernel v2 5/7] powerpc/iommu: Add a window number to iommu_table_group_ops::get_table_size Alexey Kardashevskiy
2020-03-23 7:53 ` [PATCH kernel v2 6/7] powerpc/powernv/phb4: Add 4GB IOMMU bypass mode Alexey Kardashevskiy
2020-03-23 7:53 ` [PATCH kernel v2 7/7] vfio/spapr_tce: Advertise and allow a huge DMA windows at 4GB Alexey Kardashevskiy
2020-04-08 9:43 ` [PATCH kernel v2 0/7] powerpc/powenv/ioda: Allow huge DMA window " Alexey Kardashevskiy
2020-04-16 1:27 ` Alexey Kardashevskiy
2020-04-16 2:34 ` Oliver O'Halloran
2020-04-16 2:53 ` Oliver O'Halloran
2020-04-17 1:26 ` Russell Currey
2020-04-17 5:47 ` Alexey Kardashevskiy
2020-04-20 14:04 ` Oliver O'Halloran
2020-04-21 5:11 ` Alexey Kardashevskiy
2020-04-21 6:35 ` Oliver O'Halloran
2020-04-22 6:49 ` Alexey Kardashevskiy [this message]
2020-04-22 9:11 ` Oliver O'Halloran
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ee3fd87f-f2b6-1439-a310-fedc614e6155@ozlabs.ru \
--to=aik@ozlabs.ru \
--cc=alistair@popple.id.au \
--cc=david@gibson.dropbear.id.au \
--cc=farosas@linux.ibm.com \
--cc=kvm-ppc@vger.kernel.org \
--cc=kvm@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=oohall@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).