From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: Gavin Shan <gwshan@linux.vnet.ibm.com>, linuxppc-dev@lists.ozlabs.org
Cc: bhelgaas@google.com, linux-pci@vger.kernel.org
Subject: Re: [PATCH v4 03/21] powerpc/powernv: M64 support improvement
Date: Sat, 09 May 2015 20:24:14 +1000
Message-ID: <554DE04E.8080900@ozlabs.ru>
In-Reply-To: <1430460188-31343-4-git-send-email-gwshan@linux.vnet.ibm.com>
On 05/01/2015 04:02 PM, Gavin Shan wrote:
> We're having the hardware or enforced (on P7IOC) limitation: M64
I would think that if it is enforced, then it is enforced by hardware, but
you say "hardware OR enforced" :)
> segment#x can only be assigned to PE#x. IO and M32 segment can be
> mapped to arbitrary PE# via IODT and M32DT. It means the PE number
> should be x if M64 segment#x has been assigned to the PE. Also, each
> PE own one M64 segment at most. Currently, we are reserving PE#
> according to root port's M64 window. It won't be reliable once we
> extend M64 windows of root port, or the upstream port of the PCIE
> switch behind root port to PHB's M64 window, in order to support
> PCI hotplug in future.
>
> The patch reserves PE# for M64 segments according to the M64 resources
> of the PCI devices (not bridges) contained in the PE. Besides, it's
> always worthy to trace the M64 segments consumed by the PE, which can
> be released at PCI unplugging time.
>
> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
> ---
> arch/powerpc/platforms/powernv/pci-ioda.c | 190 ++++++++++++++++++------------
> arch/powerpc/platforms/powernv/pci.h | 10 +-
> 2 files changed, 122 insertions(+), 78 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 646962f..a994882 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -283,28 +283,78 @@ fail:
> return -EIO;
> }
>
> -static void pnv_ioda_reserve_m64_pe(struct pnv_phb *phb)
> +/* We extend the M64 window of root port, or the upstream bridge port
> + * of the PCIE switch behind root port. So we shouldn't reserve PEs
> + * for M64 resources because there are no (normal) PCI devices consuming
"PCI devices"? Not "root ports or PCI bridges"?
> + * M64 resources on the PCI buses leading from root port, or the upstream
> + * bridge port.The function returns true if the indicated PCI bus needs
> + * reserved PEs because of M64 resources in advance. Otherwise, the
> + * function returns false.
> + */
> +static bool pnv_ioda_need_m64_pe(struct pnv_phb *phb,
> + struct pci_bus *bus)
> {
> - resource_size_t sgsz = phb->ioda.m64_segsize;
> + /* Root bus */
The comment is too obvious, as the function called below is named "pci_is_root_bus" :)
> + if (!bus || pci_is_root_bus(bus))
> + return false;
> +
> + /* Bus leading from root port. We need check what types of PCI
> + * devices on the bus. If it's connecting PCI bridge, we don't
> + * need reserve M64 PEs for it. Otherwise, we still need to do
> + * that.
> + */
> + if (pci_is_root_bus(bus->self->bus)) {
> + struct pci_dev *pdev;
> +
> + list_for_each_entry(pdev, &bus->devices, bus_list) {
> + if (pdev->hdr_type == PCI_HEADER_TYPE_NORMAL)
> + return true;
> + }
> +
> + return false;
> + }
> +
> + /* Bus leading from the upstream bridge port on top level */
> + if (pci_is_root_bus(bus->self->bus->self->bus))
Is this for second-level bridges, like root->bridge->bridge? And for 3 levels
you will need a PE?
> + return false;
> +
> + return true;
> +}
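To make the depth question above concrete, here is a toy user-space model of the decision order as I read it from the quoted function. struct toy_bus and the toy_* helpers are hypothetical stand-ins, not kernel API; "parent" plays the role of bus->self->bus.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_bus: only what the check needs. */
struct toy_bus {
	struct toy_bus *parent;	/* NULL for the root bus */
	bool has_normal_dev;	/* a PCI_HEADER_TYPE_NORMAL device sits on it */
};

static bool toy_is_root(const struct toy_bus *bus)
{
	return bus->parent == NULL;
}

/* Follows the same order of checks as the quoted pnv_ioda_need_m64_pe(). */
static bool toy_need_m64_pe(const struct toy_bus *bus)
{
	/* Depth 0: the root bus never needs reserved M64 PEs. */
	if (!bus || toy_is_root(bus))
		return false;

	/* Depth 1: bus behind the root port, only if it holds a normal device. */
	if (toy_is_root(bus->parent))
		return bus->has_normal_dev;

	/* Depth 2: bus behind the top-level upstream port. */
	if (toy_is_root(bus->parent->parent))
		return false;

	/* Depth 3 and deeper. */
	return true;
}
```

If this reading is right, root->bridge->bridge returns false and anything deeper returns true, regardless of what sits on the bus.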
> +
> +static void pnv_ioda_reserve_m64_pe(struct pnv_phb *phb,
> + struct pci_bus *bus)
> +{
> + resource_size_t segsz = phb->ioda.m64_segsize;
> struct pci_dev *pdev;
> struct resource *r;
> - int base, step, i;
> + unsigned long pe_no, limit;
> + int i;
>
> - /*
> - * Root bus always has full M64 range and root port has
> - * M64 range used in reality. So we're checking root port
> - * instead of root bus.
> + if (!pnv_ioda_need_m64_pe(phb, bus))
> + return;
> +
> + /* The bridge's M64 window might have been extended to the
> + * PHB's M64 window in order to support PCI hotplug. So the
> + * bridge's M64 window isn't reliable to be used for picking
> + * PE# for its leading PCI bus. We have to check the M64
> + * resources consumed by the PCI devices, which seat on the
> + * PCI bus.
> */
> - list_for_each_entry(pdev, &phb->hose->bus->devices, bus_list) {
> - for (i = 0; i < PCI_BRIDGE_RESOURCE_NUM; i++) {
> - r = &pdev->resource[PCI_BRIDGE_RESOURCES + i];
> - if (!r->parent ||
> - !pnv_pci_is_mem_pref_64(r->flags))
> + list_for_each_entry(pdev, &bus->devices, bus_list) {
> + for (i = 0; i < PCI_NUM_RESOURCES; i++) {
> +#ifdef CONFIG_PCI_IOV
> + if (i >= PCI_IOV_RESOURCES && i <= PCI_IOV_RESOURCE_END)
> + continue;
> +#endif
> + r = &pdev->resource[i];
> + if (!r->flags || r->start >= r->end ||
> + !r->parent || !pnv_pci_is_mem_pref_64(r->flags))
> continue;
>
> - base = (r->start - phb->ioda.m64_base) / sgsz;
> - for (step = 0; step < resource_size(r) / sgsz; step++)
> - pnv_ioda_reserve_pe(phb, base + step);
> + pe_no = (r->start - phb->ioda.m64_base) / segsz;
> + limit = ALIGN(r->end - phb->ioda.m64_base, segsz) / segsz;
> + for (; pe_no < limit; pe_no++)
> + pnv_ioda_reserve_pe(phb, pe_no);
> }
> }
> }
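A small user-space sketch of the segment range arithmetic in the hunk above, assuming r->end is inclusive (as it is for struct resource) and that the segment size is a power of two. TOY_ALIGN, struct toy_range and both helpers are made up for illustration. The second helper is only an alternative I would consider, not something the patch does.

```c
#include <stdint.h>

/* Round x up to a multiple of a; a is assumed to be a power of two. */
#define TOY_ALIGN(x, a)	(((x) + ((uint64_t)(a) - 1)) & ~((uint64_t)(a) - 1))

/* Half-open range of segment (and hence PE) numbers: [first, limit). */
struct toy_range {
	uint64_t first;
	uint64_t limit;
};

/* As in the patch: limit = ALIGN(end - base, segsz) / segsz. */
static struct toy_range seg_range_patch(uint64_t base, uint64_t segsz,
					uint64_t start, uint64_t end)
{
	struct toy_range r = {
		(start - base) / segsz,
		TOY_ALIGN(end - base, segsz) / segsz,
	};
	return r;
}

/* Alternative: index of the segment holding the last byte, plus one. */
static struct toy_range seg_range_incl(uint64_t base, uint64_t segsz,
				       uint64_t start, uint64_t end)
{
	struct toy_range r = {
		(start - base) / segsz,
		(end - base) / segsz + 1,
	};
	return r;
}
```

The two agree whenever the resource ends on the last byte of a segment, which should be the common case. They differ only when (end - base) is an exact multiple of the segment size, i.e. the last byte is the first byte of a new segment; the ALIGN form then leaves that segment out.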
> @@ -316,85 +366,64 @@ static int pnv_ioda_pick_m64_pe(struct pnv_phb *phb,
> struct pci_dev *pdev;
> struct resource *r;
> struct pnv_ioda_pe *master_pe, *pe;
> - unsigned long size, *pe_alloc;
> - bool found;
> - int start, i, j;
> -
> - /* Root bus shouldn't use M64 */
> - if (pci_is_root_bus(bus))
> - return IODA_INVALID_PE;
> -
> - /* We support only one M64 window on each bus */
> - found = false;
> - pci_bus_for_each_resource(bus, r, i) {
> - if (r && r->parent &&
> - pnv_pci_is_mem_pref_64(r->flags)) {
> - found = true;
> - break;
> - }
> - }
> + unsigned long size, *pe_bitsmap;
s/pe_bitsmap/pe_bitmap/
> + unsigned long pe_no, limit;
> + int i;
>
> - /* No M64 window found ? */
> - if (!found)
> + if (!pnv_ioda_need_m64_pe(phb, bus))
> return IODA_INVALID_PE;
>
> - /* Allocate bitmap */
> + /* Allocate bitmap */
> size = _ALIGN_UP(phb->ioda.total_pe / 8, sizeof(unsigned long));
> - pe_alloc = kzalloc(size, GFP_KERNEL);
> - if (!pe_alloc) {
> - pr_warn("%s: Out of memory !\n",
> - __func__);
> + pe_bitsmap = kzalloc(size, GFP_KERNEL);
> + if (!pe_bitsmap) {
> + pr_warn("%s: Out of memory !\n", __func__);
> return IODA_INVALID_PE;
> }
>
> - /*
> - * Figure out reserved PE numbers by the PE
> - * the its child PEs.
> - */
> - start = (r->start - phb->ioda.m64_base) / segsz;
> - for (i = 0; i < resource_size(r) / segsz; i++)
> - set_bit(start + i, pe_alloc);
> -
> - if (all)
> - goto done;
> -
> - /*
> - * If the PE doesn't cover all subordinate buses,
> - * we need subtract from reserved PEs for children.
> + /* The bridge's M64 window might be extended to PHB's M64
> + * window by intention to support PCI hotplug. So we have
> + * to check the M64 resources consumed by the PCI devices
> + * on the PCI bus.
> */
> list_for_each_entry(pdev, &bus->devices, bus_list) {
> - if (!pdev->subordinate)
> - continue;
> + for (i = 0; i < PCI_NUM_RESOURCES; i++) {
> +#ifdef CONFIG_PCI_IOV
> + if (i >= PCI_IOV_RESOURCES &&
> + i <= PCI_IOV_RESOURCE_END)
> + continue;
> +#endif
> + /* Don't scan bridge's window if the PE
> + * doesn't contain its subordinate bus.
> + */
> + if (!all && i >= PCI_BRIDGE_RESOURCES &&
> + i <= PCI_BRIDGE_RESOURCE_END)
> + continue;
>
> - pci_bus_for_each_resource(pdev->subordinate, r, i) {
> - if (!r || !r->parent ||
> - !pnv_pci_is_mem_pref_64(r->flags))
> + r = &pdev->resource[i];
> + if (!r->flags || r->start >= r->end ||
> + !r->parent || !pnv_pci_is_mem_pref_64(r->flags))
> continue;
>
> - start = (r->start - phb->ioda.m64_base) / segsz;
> - for (j = 0; j < resource_size(r) / segsz ; j++)
> - clear_bit(start + j, pe_alloc);
> - }
> - }
> + pe_no = (r->start - phb->ioda.m64_base) / segsz;
> + limit = ALIGN(r->end - phb->ioda.m64_base, segsz) / segsz;
> + for (; pe_no < limit; pe_no++)
> + set_bit(pe_no, pe_bitsmap);
> + }
> + }
>
> - /*
> - * the current bus might not own M64 window and that's all
> - * contributed by its child buses. For the case, we needn't
> - * pick M64 dependent PE#.
> - */
> - if (bitmap_empty(pe_alloc, phb->ioda.total_pe)) {
> - kfree(pe_alloc);
> + /* No M64 window found ? */
> + if (bitmap_empty(pe_bitsmap, phb->ioda.total_pe)) {
> + kfree(pe_bitsmap);
> return IODA_INVALID_PE;
> }
>
> - /*
> - * Figure out the master PE and put all slave PEs to master
> - * PE's list to form compound PE.
> + /* Figure out the master PE and put all slave PEs
> + * to master PE's list to form compound PE.
> */
> -done:
> master_pe = NULL;
> i = -1;
> - while ((i = find_next_bit(pe_alloc, phb->ioda.total_pe, i + 1)) <
> + while ((i = find_next_bit(pe_bitsmap, phb->ioda.total_pe, i + 1)) <
> phb->ioda.total_pe) {
> pe = &phb->ioda.pe_array[i];
>
> @@ -408,6 +437,13 @@ done:
> list_add_tail(&pe->list, &master_pe->slaves);
> }
>
> + /* Pick the M64 segment, which should be available. Also,
test_and_set_bit() does not pick or choose anything; it just marks PE#pe_number as used.
> + * those M64 segments consumed by slave PEs are contributed
> + * to the master PE.
> + */
> + BUG_ON(test_and_set_bit(pe->pe_number, phb->ioda.m64_segmap));
> + BUG_ON(test_and_set_bit(pe->pe_number, master_pe->m64_segmap));
> +
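For reference, a non-atomic user-space model of what test_and_set_bit() does, to show why "pick" is misleading in the comment. toy_test_and_set_bit and TOY_BITS_PER_LONG are illustrative names only; the kernel version is atomic and arch-specific.

```c
#include <limits.h>
#include <stdbool.h>

#define TOY_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)

/*
 * Set bit nr in map and return the value it had before.
 * Nothing is searched or selected: the caller already chose nr.
 */
static bool toy_test_and_set_bit(unsigned long nr, unsigned long *map)
{
	unsigned long mask = 1UL << (nr % TOY_BITS_PER_LONG);
	unsigned long *word = &map[nr / TOY_BITS_PER_LONG];
	bool old = (*word & mask) != 0;

	*word |= mask;
	return old;
}
```

So the BUG_ON() fires only if the segment was already marked, which makes the two lines an assertion plus bookkeeping rather than an allocation.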
> /* P7IOC supports M64DT, which helps mapping M64 segment
> * to one particular PE#. Unfortunately, PHB3 has fixed
> * mapping between M64 segment and PE#. In order for same
> @@ -431,7 +467,7 @@ done:
> }
> }
>
> - kfree(pe_alloc);
> + kfree(pe_bitsmap);
> return master_pe->pe_number;
> }
>
> @@ -1233,7 +1269,7 @@ static void pnv_pci_ioda_setup_PEs(void)
>
> /* M64 layout might affect PE allocation */
> if (phb->reserve_m64_pe)
> - phb->reserve_m64_pe(phb);
> + phb->reserve_m64_pe(phb, phb->hose->bus);
>
> pnv_ioda_setup_PEs(hose->bus);
> }
> diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
> index 070ee88..19022cf 100644
> --- a/arch/powerpc/platforms/powernv/pci.h
> +++ b/arch/powerpc/platforms/powernv/pci.h
> @@ -49,6 +49,13 @@ struct pnv_ioda_pe {
> /* PE number */
> unsigned int pe_number;
>
> + /* IO/M32/M64 segments consumed by the PE. Each PE can
> + * have one M64 segment at most, but M64 segments consumed
> + * by slave PEs will be contributed to the master PE. One
> + * PE can own multiple IO and M32 segments.
> + */
> + unsigned long m64_segmap[8];
Why 8? Is it 64*8 = 512 segments? If so, maybe s/8/BITS_TO_LONGS(512)/ ?
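A sketch of sizing the map from a segment count rather than a bare 8, written for user space. The TOY_* macros mimic the kernel's BITS_PER_LONG and BITS_TO_LONGS(); TOY_MAX_M64_SEGS = 512 is only my guess at the intended bound, inferred from 64 * 8.

```c
#include <limits.h>
#include <stddef.h>

#define TOY_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define TOY_BITS_TO_LONGS(n)	\
	(((n) + TOY_BITS_PER_LONG - 1) / TOY_BITS_PER_LONG)

/* Assumed upper bound on M64 segments per PHB (a guess, see above). */
#define TOY_MAX_M64_SEGS	512

struct toy_pe {
	/* One bit per M64 segment, whatever the width of unsigned long. */
	unsigned long m64_segmap[TOY_BITS_TO_LONGS(TOY_MAX_M64_SEGS)];
};
```

With 64-bit longs this is still 8 words, but the constant now says what it counts. In the kernel, DECLARE_BITMAP(m64_segmap, N) would express the same thing.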
> +
> /* "Weight" assigned to the PE for the sake of DMA resource
> * allocations
> */
> @@ -114,7 +121,7 @@ struct pnv_phb {
> u32 (*bdfn_to_pe)(struct pnv_phb *phb, struct pci_bus *bus, u32 devfn);
> void (*shutdown)(struct pnv_phb *phb);
> int (*init_m64)(struct pnv_phb *phb);
> - void (*reserve_m64_pe)(struct pnv_phb *phb);
> + void (*reserve_m64_pe)(struct pnv_phb *phb, struct pci_bus *bus);
> int (*pick_m64_pe)(struct pnv_phb *phb, struct pci_bus *bus, int all);
> int (*get_pe_state)(struct pnv_phb *phb, int pe_no);
> void (*freeze_pe)(struct pnv_phb *phb, int pe_no);
> @@ -153,6 +160,7 @@ struct pnv_phb {
> struct mutex pe_alloc_mutex;
>
> /* M32 & IO segment maps */
> + unsigned long m64_segmap[8];
> unsigned int *m32_segmap;
> unsigned int *io_segmap;
> struct pnv_ioda_pe *pe_array;
>
--
Alexey