From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: Oliver O'Halloran <oohall@gmail.com>
Cc: "linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>
Subject: Re: [PATCH 15/15] powerpc/powernv/sriov: Make single PE mode a per-BAR setting
Date: Wed, 15 Jul 2020 18:00:06 +1000
Message-ID: <25d7fd88-668a-861e-a93c-3188caeac3cf@ozlabs.ru>
In-Reply-To: <CAOSf1CHL9YoSohwMWm1YkLbLTqOn-WfBMKERZaPYb_5-UKmsuw@mail.gmail.com>
On 15/07/2020 16:16, Oliver O'Halloran wrote:
> On Wed, Jul 15, 2020 at 3:24 PM Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>>
>>
>>> @@ -158,9 +157,9 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
>>> goto disable_iov;
>>> pdev->dev.archdata.iov_data = iov;
>>>
>>> + /* FIXME: totalvfs > phb->ioda.total_pe_num is going to be a problem */
>>
>>
>> WARN_ON_ONCE() then?
>
> can't hurt
>
>>> @@ -173,50 +172,51 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
>>> goto disable_iov;
>>> }
>>>
>>> - total_vf_bar_sz += pci_iov_resource_size(pdev,
>>> - i + PCI_IOV_RESOURCES);
>>> + vf_bar_sz = pci_iov_resource_size(pdev, i + PCI_IOV_RESOURCES);
>>>
>>> /*
>>> - * If bigger than quarter of M64 segment size, just round up
>>> - * power of two.
>>> + * Generally, one segmented M64 BAR maps one IOV BAR. However,
>>> + * if a VF BAR is too large we end up wasting a lot of space.
>>> + * If we've got a BAR that's bigger than greater than 1/4 of the
>>
>>
>> bigger, greater, huger? :)
>>
>> Also, a nit: s/got a BAR/got a VF BAR/
>
> whatever, it's just words

You are talking about these BARs and those BARs, and since we want "to
help out the next sucker^Wperson who needs to tinker with it", using
precise terms is kinda essential here.

>
>>> + * default window's segment size then switch to using single PE
>>> + * windows. This limits the total number of VFs we can support.
>>
>> Just to get idea about absolute numbers here.
>>
>> On my P9:
>>
>> ./pciex@600c3c0300000/ibm,opal-m64-window
>> 00060200 00000000 00060200 00000000 00000040 00000000
>>
>> so that default window's segment size is 0x40.0000.0000/512 = 512MB?
>
> Yeah. It'll vary a bit since PHB3 and some PHB4s have 256.
>
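
The arithmetic above can be checked with a small sketch. It assumes, as the
reading of the dump above does, that the last two 32-bit cells of
ibm,opal-m64-window hold the window size, and that the segment size is the
window size divided by the number of PEs. The helper names are hypothetical
and are not taken from the kernel.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Combine the last two 32-bit cells of the property (size high, size low)
 * into one 64-bit window size. The cell layout is an assumption based on
 * the dump quoted above.
 */
uint64_t m64_window_size(uint32_t size_hi, uint32_t size_lo)
{
	return ((uint64_t)size_hi << 32) | size_lo;
}

/* Hypothetical helper: segment size = window size / number of PEs. */
uint64_t m64_segment_size(uint64_t window_size, unsigned int total_pe_num)
{
	assert(total_pe_num > 0);
	return window_size / total_pe_num;
}
```

With the cells 00000040 00000000 the window is 0x40.0000.0000 bytes (256GB),
so 512 PEs give 512MB segments, and a PHB with 256 PEs and the same window
size would give 1GB segments.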
>>> *
>>> - * Generally, one M64 BAR maps one IOV BAR. To avoid conflict
>>> - * with other devices, IOV BAR size is expanded to be
>>> - * (total_pe * VF_BAR_size). When VF_BAR_size is half of M64
>>> - * segment size , the expanded size would equal to half of the
>>> - * whole M64 space size, which will exhaust the M64 Space and
>>> - * limit the system flexibility. This is a design decision to
>>> - * set the boundary to quarter of the M64 segment size.
>>> + * The 1/4 limit is arbitrary and can be tweaked.
>>> */
>>> - if (total_vf_bar_sz > gate) {
>>> - mul = roundup_pow_of_two(total_vfs);
>>> - dev_info(&pdev->dev,
>>> - "VF BAR Total IOV size %llx > %llx, roundup to %d VFs\n",
>>> - total_vf_bar_sz, gate, mul);
>>> - iov->m64_single_mode = true;
>>> - break;
>>> - }
>>> - }
>>> + if (vf_bar_sz > (phb->ioda.m64_segsize >> 2)) {
>>> + /*
>>> + * On PHB3, the minimum size alignment of M64 BAR in
>>> + * single mode is 32MB. If this VF BAR is smaller than
>>> + * 32MB, but still too large for a segmented window
>>> + * then we can't map it and need to disable SR-IOV for
>>> + * this device.
>>
>>
>> Why not use single PE mode for such BAR? Better than nothing.
>
> Suppose you could, but I figured VFs were mainly interesting since you
> could give each VF to a separate guest. If there's multiple VFs under
> the same single PE BAR then they'd have to be assigned to the same

True. But with one PE per VF we can still have 15 (or 14?) isolated VFs,
which is not hundreds but is better than 0.

> guest in order to retain the freeze/unfreeze behaviour that PAPR
> requires. I guess that's how it used to work, but it seems better just
> to disable them rather than having VFs which sort of work.

Well, realistically the segment size would have to be 8MB for this to
matter (or the whole window 2GB), which does not seem to happen, so it
does not matter.

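
For reference, the per-BAR decision being discussed can be modelled in
userspace roughly as below. This is only a sketch under assumptions: the enum
and function names are made up, the 32MB single-mode minimum is applied
unconditionally although the quoted comment states it for PHB3, and the real
code path in the patch may differ.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_32M (32ULL << 20)

/* Hypothetical outcome of the per-BAR check; not a kernel type. */
enum vf_bar_mapping {
	VF_BAR_SEGMENTED,	/* fits the segmented M64 window */
	VF_BAR_SINGLE_PE,	/* larger than 1/4 segment: single PE windows */
	VF_BAR_UNMAPPABLE,	/* too big for segments, too small for single mode */
};

enum vf_bar_mapping classify_vf_bar(uint64_t vf_bar_sz, uint64_t m64_segsize)
{
	assert(m64_segsize > 0);

	/* The 1/4 limit from the patch comment; described there as arbitrary. */
	if (vf_bar_sz <= (m64_segsize >> 2))
		return VF_BAR_SEGMENTED;

	/* Assumed: single mode needs at least 32MB (stated for PHB3). */
	if (vf_bar_sz < SZ_32M)
		return VF_BAR_UNMAPPABLE;

	return VF_BAR_SINGLE_PE;
}
```

With 512MB segments the 1/4 limit is 128MB, so any VF BAR that exceeds it is
already well above 32MB and the unmappable case cannot be reached; it only
shows up with much smaller segments.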
--
Alexey