From: Daniel J Blueman <daniel@numascale.com>
To: Bjorn Helgaas <bhelgaas@google.com>,
	Ingo Molnar <mingo@redhat.com>,
	Jiang Liu <jiang.liu@linux.intel.com>,
	H Peter Anvin <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>
Cc: Linux Kernel <linux-kernel@vger.kernel.org>,
	Steffen Persvold <sp@numascale.com>,
	"x86@kernel.org" <x86@kernel.org>
Subject: PCIe 32-bit MMIO exhaustion
Date: Wed, 28 Jan 2015 16:42:51 +0800
Message-ID: <54C8A10B.3070207@numascale.com>

On systems with a large number of PCI devices, we're running out of
32-bit MMIO space; e.g. one quad-port NetXtreme II adapter takes 128MB
of space (four 32MB BARs) [1].
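
To make the arithmetic concrete, here is a minimal sketch that totals
the non-prefetchable BAR space below 4GB from lspci "Memory at" lines.
The sample lines are copied from [1]; the helper names (parse_bars,
below_4g_nonpref_bytes) are made up for illustration and are not part
of any existing tool.

```python
import re

# "Memory at" lines copied from the lspci listing in [1].
LSPCI_MEMORY_LINES = """\
Memory at e6000000 (64-bit, non-prefetchable) [size=32M]
Memory at e8000000 (64-bit, non-prefetchable) [size=32M]
Memory at ea000000 (64-bit, non-prefetchable) [size=32M]
Memory at ec000000 (64-bit, non-prefetchable) [size=32M]
"""

UNIT = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}
PATTERN = re.compile(
    r"Memory at ([0-9a-f]+) \((32|64)-bit, (non-)?prefetchable\) "
    r"\[size=(\d+)([KMG]?)\]"
)


def parse_bars(text):
    """Return one dict per 'Memory at' line found in lspci output."""
    bars = []
    for m in PATTERN.finditer(text):
        bars.append({
            "base": int(m.group(1), 16),
            "size": int(m.group(4)) * UNIT[m.group(5)],
            "is_64bit": m.group(2) == "64",
            "prefetchable": m.group(3) is None,
        })
    return bars


def below_4g_nonpref_bytes(bars):
    """Sum the sizes of non-prefetchable BARs that end at or below 4GB."""
    return sum(
        b["size"]
        for b in bars
        if not b["prefetchable"] and b["base"] + b["size"] <= 1 << 32
    )


bars = parse_bars(LSPCI_MEMORY_LINES)
total_mb = below_4g_nonpref_bytes(bars) >> 20
print(f"{len(bars)} BARs, {total_mb}MB of 32-bit non-prefetchable MMIO")
```

All four BARs are 64-bit capable, yet each is placed below 4GB because
it is non-prefetchable.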

An erratum to the PCIe 2.1 spec gives guidance on the limitations of
64-bit non-prefetchable BARs (bridges have only 32-bit
non-prefetchable windows), stating that vendors can set the
prefetchable bit in BARs under certain circumstances to allow 64-bit
allocation [2].

The problem is that vendors can't know a priori which hosts their
products will be used in, so they can't just advertise prefetchable
64-bit BARs. What can be done is for system firmware to use the 64-bit
prefetchable window in bridges and assign a 64-bit non-prefetchable
device BAR into that area, where it is safe to do so (following the
guidance).

At present, Linux rejects such allocations [3] and disables the BARs.
It seems a practical solution to allow them if the firmware believes it
is safe.
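
As a rough sketch of the policy being proposed (not kernel code), the
check below models a bridge window and a device BAR and decides whether
the BAR may be claimed from the window. The window range and BAR size
come from the log line in [3]; the BAR address is a hypothetical
firmware assignment, since the log does not show one, and
firmware_says_safe is an invented flag standing in for "firmware has
followed the guidance in [2]". No such flag exists in Linux.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    start: int
    end: int  # inclusive, as printed in the kernel log
    prefetchable: bool

    @property
    def size(self):
        return self.end - self.start + 1


@dataclass(frozen=True)
class Bar:
    start: int
    size: int
    is_64bit: bool
    prefetchable: bool


FOUR_GB = 1 << 32


def fits(bar, window):
    """True if the BAR lies entirely inside the bridge window."""
    return window.start <= bar.start and bar.start + bar.size - 1 <= window.end


def may_claim(bar, window, firmware_says_safe=False):
    """Illustrative policy only; firmware_says_safe is hypothetical."""
    if not fits(bar, window):
        return False
    if bar.start + bar.size - 1 >= FOUR_GB and not bar.is_64bit:
        return False  # a 32-bit BAR cannot decode addresses above 4GB
    if window.prefetchable and not bar.prefetchable:
        # Rejected today; the proposal is to allow it for 64-bit BARs
        # when firmware has judged the placement safe.
        return bar.is_64bit and firmware_says_safe
    return True


# Window from [3]; BAR address is an assumed firmware assignment.
window = Window(0x10020000000, 0x10027FFFFFF, prefetchable=True)
bar = Bar(0x10020000000, 0x2000, is_64bit=True, prefetchable=False)

print(window.size >> 20, "MB window")
print("current behaviour:", may_claim(bar, window))
print("proposed behaviour:", may_claim(bar, window, firmware_says_safe=True))
```

The only difference between the two calls is whether firmware's
judgement is trusted; the range checks are unchanged.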

Is this plausible?

Thanks,
   Daniel

-- [1]

0000:01:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
	Subsystem: Dell Device 1f26
	Flags: bus master, fast devsel, latency 0, IRQ 24
	Memory at e6000000 (64-bit, non-prefetchable) [size=32M]
	Capabilities: [48] Power Management version 3
	Capabilities: [50] Vital Product Data
	Capabilities: [58] MSI: Enable- Count=1/16 Maskable- 64bit+
	Capabilities: [a0] MSI-X: Enable+ Count=9 Masked-
	Capabilities: [ac] Express Endpoint, MSI 00
	Capabilities: [100] Device Serial Number d4-ae-52-ff-fe-ea-5c-e8
	Capabilities: [110] Advanced Error Reporting
	Capabilities: [150] Power Budgeting <?>
	Capabilities: [160] Virtual Channel
	Kernel driver in use: bnx2

0000:01:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
	Subsystem: Dell Device 1f26
	Flags: bus master, fast devsel, latency 0, IRQ 25
	Memory at e8000000 (64-bit, non-prefetchable) [size=32M]
	Capabilities: [48] Power Management version 3
	Capabilities: [50] Vital Product Data
	Capabilities: [58] MSI: Enable- Count=1/16 Maskable- 64bit+
	Capabilities: [a0] MSI-X: Enable- Count=9 Masked-
	Capabilities: [ac] Express Endpoint, MSI 00
	Capabilities: [100] Device Serial Number d4-ae-52-ff-fe-ea-5c-ea
	Capabilities: [110] Advanced Error Reporting
	Capabilities: [150] Power Budgeting <?>
	Capabilities: [160] Virtual Channel
	Kernel driver in use: bnx2

0000:02:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
	Subsystem: Dell Device 1f26
	Flags: bus master, fast devsel, latency 0, IRQ 28
	Memory at ea000000 (64-bit, non-prefetchable) [size=32M]
	Capabilities: [48] Power Management version 3
	Capabilities: [50] Vital Product Data
	Capabilities: [58] MSI: Enable- Count=1/16 Maskable- 64bit+
	Capabilities: [a0] MSI-X: Enable- Count=9 Masked-
	Capabilities: [ac] Express Endpoint, MSI 00
	Capabilities: [100] Device Serial Number d4-ae-52-ff-fe-ea-5c-ec
	Capabilities: [110] Advanced Error Reporting
	Capabilities: [150] Power Budgeting <?>
	Capabilities: [160] Virtual Channel
	Kernel driver in use: bnx2

0000:02:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
	Subsystem: Dell Device 1f26
	Flags: bus master, fast devsel, latency 0, IRQ 29
	Memory at ec000000 (64-bit, non-prefetchable) [size=32M]
	Capabilities: [48] Power Management version 3
	Capabilities: [50] Vital Product Data
	Capabilities: [58] MSI: Enable- Count=1/16 Maskable- 64bit+
	Capabilities: [a0] MSI-X: Enable- Count=9 Masked-
	Capabilities: [ac] Express Endpoint, MSI 00
	Capabilities: [100] Device Serial Number d4-ae-52-ff-fe-ea-5c-ee
	Capabilities: [110] Advanced Error Reporting
	Capabilities: [150] Power Budgeting <?>
	Capabilities: [160] Virtual Channel
	Kernel driver in use: bnx2

-- [2] p13

https://www.pcisig.com/specifications/pciexpress/base2/PCIe_Base_r2.1_Errata_08Jun10.pdf

-- [3]

pci 0002:01:00.0: BAR 0: [mem size 0x00002000 64bit] conflicts with PCI Bus 0002:00 [mem 0x10020000000-0x10027ffffff pref]
-- 
Daniel J Blueman
Principal Software Engineer, Numascale
