public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 2.6.34-rcX] Do not expect PCI devices to return zeroes in PCIe space
@ 2010-05-01  2:54 Petr Vandrovec
  2010-05-04  6:21 ` Pan, Jacob jun
  2010-05-14 18:37 ` H. Peter Anvin
  0 siblings, 2 replies; 6+ messages in thread
From: Petr Vandrovec @ 2010-05-01  2:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: akpm, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Jacob Pan, Jesse Barnes

Hello,
  openSUSE11.3 32bit kernels hang when installed to the VMware's VMs because Moorestown
fixed capabilities detection code enters endless loop on Intel's AGP bridges (with
device ID=7191).  See https://bugzilla.kernel.org/show_bug.cgi?id=15888 for additional
details.  arch/x86/pci/mrst.c was introduced after 2.6.33, so only 2.6.34-rcX are
affected.
				Thanks,
					Petr Vandrovec


commit 11a35e56ad8275cbf62882d9c0dc2f17c2b5628b
Author: Petr Vandrovec <petr@vandrovec.name>
Date:   Fri Apr 30 19:17:43 2010 -0700

    Do not expect PCI devices to return zeroes in PCIe space

    There is no reason why old pre-PCIe/PCI-X devices should return zeroes when
    configuration space above 0x100 is accessed.  If these devices decode just
    low 8 bits of register number, conventional space repeats 15 times in
    PCIe config space.  And Moorestown parser for fixed bars then can enter
    endless loop when finding Intel AGP bridge device 0x7191 with secondary
    latency timer programmed to 0x40 - when such device is encountered, code
    will enter endless loop of reading registers 0x718 (reading 0x40010100)
    and 0x400 (reading 0x71918086).

    This change adds additional condition to the test: if device id/vendor
    match first PCIe capability, then device is not really PCIe.  It should
    not cause any problems: fixed_bar_cap is invoked only on Intel's devices,
    so only time there is possibilty to have false match would be if first
    PCIe capability would have ID 0x8086, and even then that address of
    next capability pointer and capability version will match device ID seems
    highly unlikely.

    This fix unbreaks 32bit 2.6.33+ kernels configured with Moorestown
    support to boot on AMD rev 10h+ processors under VMware in VMs which
    lack PCIe support.
    
    Signed-off-by: Petr Vandrovec <petr@vandrovec.name>

diff --git a/arch/x86/pci/mrst.c b/arch/x86/pci/mrst.c
index 8bf2fcb..cd6c277 100644
--- a/arch/x86/pci/mrst.c
+++ b/arch/x86/pci/mrst.c
@@ -55,12 +55,16 @@ static int fixed_bar_cap(struct pci_bus *bus, unsigned int devfn)
 {
 	int pos;
 	u32 pcie_cap = 0, cap_data;
+	u32 devid;
 
 	pos = PCIE_CAP_OFFSET;
 
 	if (!raw_pci_ext_ops)
 		return 0;
 
+	if (raw_pci_ops->read(pci_domain_nr(bus), bus->number, devfn, 0, 4, &devid))
+		return 0;
+
 	while (pos) {
 		if (raw_pci_ext_ops->read(pci_domain_nr(bus), bus->number,
 					  devfn, pos, 4, &pcie_cap))
@@ -69,6 +73,9 @@ static int fixed_bar_cap(struct pci_bus *bus, unsigned int devfn)
 		if (pcie_cap == 0xffffffff)
 			return 0;
 
+		if (pcie_cap == devid && pos == PCIE_CAP_OFFSET)
+			return 0;
+
 		if (PCI_EXT_CAP_ID(pcie_cap) == PCI_EXT_CAP_ID_VNDR) {
 			raw_pci_ext_ops->read(pci_domain_nr(bus), bus->number,
 					      devfn, pos + 4, 4, &cap_data);


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* RE: [PATCH 2.6.34-rcX] Do not expect PCI devices to return zeroes in PCIe space
  2010-05-01  2:54 [PATCH 2.6.34-rcX] Do not expect PCI devices to return zeroes in PCIe space Petr Vandrovec
@ 2010-05-04  6:21 ` Pan, Jacob jun
  2010-05-04  7:31   ` Petr Vandrovec
  2010-05-14 18:37 ` H. Peter Anvin
  1 sibling, 1 reply; 6+ messages in thread
From: Pan, Jacob jun @ 2010-05-04  6:21 UTC (permalink / raw)
  To: Petr Vandrovec, linux-kernel@vger.kernel.org
  Cc: akpm@linux-foundation.org, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86@kernel.org, Jesse Barnes, Pan, Jacob jun

Hi Petr,

There are other code in the kernel makes similar assumption of accessing pci cfg above 0x100. (but they do not hang in a loop)
e.g. in drivers/pci/probe.c
* accesses, or the device is behind a reverse Express bridge.  So we try
 * reading the dword at 0x100 which must either be 0 or a valid extended
 * capability header.
 */
int pci_cfg_space_size_ext(struct pci_dev *dev)
{
	u32 status;
	int pos = PCI_CFG_SPACE_SIZE;

	if (pci_read_config_dword(dev, pos, &status) != PCIBIOS_SUCCESSFUL)
		goto fail;
	if (status == 0xffffffff)
		goto fail;


Back to the problem itself, hpa has suggested a better fix might be using cfg_size for checking in fixed_bar_cap. But we can not use it right now since we have cfg_size set to 0x100 on MRST (due to lack of PCI_CAP_ID_EXP in the PCI shim). I will negotiate with FW guys so that we have the correct return from pci_cfg_space_size() for Moorestown.

Until then, your current fix should be good.

Thanks,

Jacob
> -----Original Message-----
> From: Petr Vandrovec [mailto:petr@vandrovec.name]
> Sent: Friday, April 30, 2010 7:54 PM
> To: linux-kernel@vger.kernel.org
> Cc: akpm@linux-foundation.org; Thomas Gleixner; Ingo Molnar; H. Peter
> Anvin; x86@kernel.org; Pan, Jacob jun; Jesse Barnes
> Subject: [PATCH 2.6.34-rcX] Do not expect PCI devices to return zeroes
> in PCIe space
> 
> Hello,
>   openSUSE11.3 32bit kernels hang when installed to the VMware's VMs
> because Moorestown
> fixed capabilities detection code enters endless loop on Intel's AGP
> bridges (with
> device ID=7191).  See https://bugzilla.kernel.org/show_bug.cgi?id=15888
> for additional
> details.  arch/x86/pci/mrst.c was introduced after 2.6.33, so only
> 2.6.34-rcX are
> affected.
> 				Thanks,
> 					Petr Vandrovec
> 
> 
> commit 11a35e56ad8275cbf62882d9c0dc2f17c2b5628b
> Author: Petr Vandrovec <petr@vandrovec.name>
> Date:   Fri Apr 30 19:17:43 2010 -0700
> 
>     Do not expect PCI devices to return zeroes in PCIe space
> 
>     There is no reason why old pre-PCIe/PCI-X devices should return
> zeroes when
>     configuration space above 0x100 is accessed.  If these devices
> decode just
>     low 8 bits of register number, conventional space repeats 15 times
> in
>     PCIe config space.  And Moorestown parser for fixed bars then can
> enter
>     endless loop when finding Intel AGP bridge device 0x7191 with
> secondary
>     latency timer programmed to 0x40 - when such device is encountered,
> code
>     will enter endless loop of reading registers 0x718 (reading
> 0x40010100)
>     and 0x400 (reading 0x71918086).
> 
>     This change adds additional condition to the test: if device
> id/vendor
>     match first PCIe capability, then device is not really PCIe.  It
> should
>     not cause any problems: fixed_bar_cap is invoked only on Intel's
> devices,
>     so only time there is possibilty to have false match would be if
> first
>     PCIe capability would have ID 0x8086, and even then that address of
>     next capability pointer and capability version will match device ID
> seems
>     highly unlikely.
> 
>     This fix unbreaks 32bit 2.6.33+ kernels configured with Moorestown
>     support to boot on AMD rev 10h+ processors under VMware in VMs
> which
>     lack PCIe support.
> 
>     Signed-off-by: Petr Vandrovec <petr@vandrovec.name>
> 
> diff --git a/arch/x86/pci/mrst.c b/arch/x86/pci/mrst.c
> index 8bf2fcb..cd6c277 100644
> --- a/arch/x86/pci/mrst.c
> +++ b/arch/x86/pci/mrst.c
> @@ -55,12 +55,16 @@ static int fixed_bar_cap(struct pci_bus *bus,
> unsigned int devfn)
>  {
>  	int pos;
>  	u32 pcie_cap = 0, cap_data;
> +	u32 devid;
> 
>  	pos = PCIE_CAP_OFFSET;
> 
>  	if (!raw_pci_ext_ops)
>  		return 0;
> 
> +	if (raw_pci_ops->read(pci_domain_nr(bus), bus->number, devfn, 0,
> 4, &devid))
> +		return 0;
> +
>  	while (pos) {
>  		if (raw_pci_ext_ops->read(pci_domain_nr(bus), bus->number,
>  					  devfn, pos, 4, &pcie_cap))
> @@ -69,6 +73,9 @@ static int fixed_bar_cap(struct pci_bus *bus,
> unsigned int devfn)
>  		if (pcie_cap == 0xffffffff)
>  			return 0;
> 
> +		if (pcie_cap == devid && pos == PCIE_CAP_OFFSET)
> +			return 0;
> +
>  		if (PCI_EXT_CAP_ID(pcie_cap) == PCI_EXT_CAP_ID_VNDR) {
>  			raw_pci_ext_ops->read(pci_domain_nr(bus), bus-
> >number,
>  					      devfn, pos + 4, 4, &cap_data);


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2.6.34-rcX] Do not expect PCI devices to return zeroes in PCIe space
  2010-05-04  6:21 ` Pan, Jacob jun
@ 2010-05-04  7:31   ` Petr Vandrovec
  0 siblings, 0 replies; 6+ messages in thread
From: Petr Vandrovec @ 2010-05-04  7:31 UTC (permalink / raw)
  To: Pan, Jacob jun
  Cc: linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86@kernel.org,
	Jesse Barnes

On 05/03/10 23:21, Pan, Jacob jun wrote:
> Hi Petr,
>
> There are other code in the kernel makes similar assumption of accessing pci cfg above 0x100. (but they do not hang in a loop)
> e.g. in drivers/pci/probe.c
> * accesses, or the device is behind a reverse Express bridge.  So we try
>   * reading the dword at 0x100 which must either be 0 or a valid extended
>   * capability header.
>   */
> int pci_cfg_space_size_ext(struct pci_dev *dev)
> {
> 	u32 status;
> 	int pos = PCI_CFG_SPACE_SIZE;
>
> 	if (pci_read_config_dword(dev, pos,&status) != PCIBIOS_SUCCESSFUL)
> 		goto fail;
> 	if (status == 0xffffffff)
> 		goto fail;
>
>
> Back to the problem itself, hpa has suggested a better fix might be using cfg_size for checking in fixed_bar_cap. But we can not use it right now since we have cfg_size set to 0x100 on MRST (due to lack of PCI_CAP_ID_EXP in the PCI shim). I will negotiate with FW guys so that we have the correct return from pci_cfg_space_size() for Moorestown.
>
> Until then, your current fix should be good.

Thanks.  Other possibility would be to modify amd_bus.c to verify that
ENABLE_CF8_EXT_CFG bit in MSR_AMD64_NB_CFG is actually writeable, and
not set PCI_HAS_IO_ECS if bit is read-as-zero.  That would fix both
Moorestown code as well as pci_cfg_space_size_ext - patch below can be
applied instead of mrst.c changes.
							Petr


Do not report AMD processors in VMware as ECS capable

In a VM AMD processors do not have integrated northbridge, and so their
northbridge-related MSRs do not work, and do not enable PCIe configuration
space accesses via I/O ports 0xCF8/0xCFC.  Virtualized processor can be
detected by having NB_CFG register read-only.

Signed-off-by: Petr Vandrovec <petr@vandrovec.name>

diff --git a/arch/x86/pci/amd_bus.c b/arch/x86/pci/amd_bus.c
index fc1e8fe..cf03bff 100644
--- a/arch/x86/pci/amd_bus.c
+++ b/arch/x86/pci/amd_bus.c
@@ -349,6 +349,8 @@ static int __init early_fill_mp_bus_info(void)

  #define ENABLE_CF8_EXT_CFG      (1ULL << 46)

+static int ecs_ok = 1;
+
  static void enable_pci_io_ecs(void *unused)
  {
  	u64 reg;
@@ -356,6 +358,10 @@ static void enable_pci_io_ecs(void *unused)
  	if (!(reg & ENABLE_CF8_EXT_CFG)) {
  		reg |= ENABLE_CF8_EXT_CFG;
  		wrmsrl(MSR_AMD64_NB_CFG, reg);
+		/* VMware implements NB_CFG MSR as read-only.  Verify write worked... */
+		rdmsrl(MSR_AMD64_NB_CFG, reg);
+		if (!(reg & ENABLE_CF8_EXT_CFG))
+			ecs_ok = 0;
  	}
  }

@@ -390,7 +396,8 @@ static int __init pci_io_ecs_init(void)
  	for_each_online_cpu(cpu)
  		amd_cpu_notify(&amd_cpu_notifier, (unsigned long)CPU_ONLINE,
  			       (void *)(long)cpu);
-	pci_probe |= PCI_HAS_IO_ECS;
+	if (ecs_ok)
+		pci_probe |= PCI_HAS_IO_ECS;

  	return 0;
  }

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 2.6.34-rcX] Do not expect PCI devices to return zeroes in PCIe space
  2010-05-01  2:54 [PATCH 2.6.34-rcX] Do not expect PCI devices to return zeroes in PCIe space Petr Vandrovec
  2010-05-04  6:21 ` Pan, Jacob jun
@ 2010-05-14 18:37 ` H. Peter Anvin
  2010-05-14 20:51   ` Petr Vandrovec
  1 sibling, 1 reply; 6+ messages in thread
From: H. Peter Anvin @ 2010-05-14 18:37 UTC (permalink / raw)
  To: Petr Vandrovec
  Cc: linux-kernel, akpm, Thomas Gleixner, Ingo Molnar, x86, Jacob Pan,
	Jesse Barnes

[-- Attachment #1: Type: text/plain, Size: 593 bytes --]

On 04/30/2010 07:54 PM, Petr Vandrovec wrote:
> Hello,
>   openSUSE11.3 32bit kernels hang when installed to the VMware's VMs because Moorestown
> fixed capabilities detection code enters endless loop on Intel's AGP bridges (with
> device ID=7191).  See https://bugzilla.kernel.org/show_bug.cgi?id=15888 for additional
> details.  arch/x86/pci/mrst.c was introduced after 2.6.33, so only 2.6.34-rcX are
> affected.
> 				Thanks,
> 					Petr Vandrovec

Hi Petr,

Could you check if this patch fixes your problem, and if so let me know
as soon as possible?

Sorry for the delay.

Thanks,

	-hpa

[-- Attachment #2: mrst-pci.patch --]
[-- Type: text/x-patch, Size: 520 bytes --]

diff --git a/arch/x86/pci/mrst.c b/arch/x86/pci/mrst.c
index 8bf2fcb..1cdc02c 100644
--- a/arch/x86/pci/mrst.c
+++ b/arch/x86/pci/mrst.c
@@ -247,6 +247,10 @@ static void __devinit pci_fixed_bar_fixup(struct pci_dev *dev)
 	u32 size;
 	int i;
 
+	/* Must have extended configuration space */
+	if (dev->cfg_size < PCIE_CAP_OFFSET + 4)
+		return;
+
 	/* Fixup the BAR sizes for fixed BAR devices and make them unmoveable */
 	offset = fixed_bar_cap(dev->bus, dev->devfn);
 	if (!offset || PCI_DEVFN(2, 0) == dev->devfn ||

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 2.6.34-rcX] Do not expect PCI devices to return zeroes in  PCIe space
  2010-05-14 18:37 ` H. Peter Anvin
@ 2010-05-14 20:51   ` Petr Vandrovec
  2010-05-14 21:39     ` [tip:x86/urgent] x86, mrst: Don't blindly access extended config space tip-bot for H. Peter Anvin
  0 siblings, 1 reply; 6+ messages in thread
From: Petr Vandrovec @ 2010-05-14 20:51 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: linux-kernel, akpm, Thomas Gleixner, Ingo Molnar, x86, Jacob Pan,
	Jesse Barnes, dcovelli

On Fri, May 14, 2010 at 11:37 AM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 04/30/2010 07:54 PM, Petr Vandrovec wrote:
>> Hello,
>>   openSUSE11.3 32bit kernels hang when installed to the VMware's VMs because Moorestown
>> fixed capabilities detection code enters endless loop on Intel's AGP bridges (with
>> device ID=7191).  See https://bugzilla.kernel.org/show_bug.cgi?id=15888 for additional
>> details.  arch/x86/pci/mrst.c was introduced after 2.6.33, so only 2.6.34-rcX are
>> affected.
>>                               Thanks,
>>                                       Petr Vandrovec

> Could you check if this patch fixes your problem, and if so let me know
> as soon as possible?
>
> Sorry for the delay.

Thanks for the fix.  Yes, it fixes hang too, and seems much nicer..

Petr

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [tip:x86/urgent] x86, mrst: Don't blindly access extended config space
  2010-05-14 20:51   ` Petr Vandrovec
@ 2010-05-14 21:39     ` tip-bot for H. Peter Anvin
  0 siblings, 0 replies; 6+ messages in thread
From: tip-bot for H. Peter Anvin @ 2010-05-14 21:39 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, petr, jbarnes, jacob.jun.pan, tglx, hpa

Commit-ID:  e9b1d5d0ff4d3ae86050dc4c91b3147361c7af9e
Gitweb:     http://git.kernel.org/tip/e9b1d5d0ff4d3ae86050dc4c91b3147361c7af9e
Author:     H. Peter Anvin <hpa@linux.intel.com>
AuthorDate: Fri, 14 May 2010 13:55:57 -0700
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Fri, 14 May 2010 13:55:57 -0700

x86, mrst: Don't blindly access extended config space

Do not blindly access extended configuration space unless we actively
know we're on a Moorestown platform.  The fixed-size BAR capability
lives in the extended configuration space, and thus is not applicable
if the configuration space isn't appropriately sized.

This fixes booting certain VMware configurations with CONFIG_MRST=y.

Moorestown will add a fake PCI-X 266 capability to advertise the
presence of extended configuration space.

Reported-and-tested-by: Petr Vandrovec <petr@vandrovec.name>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Acked-by: Jacob Pan <jacob.jun.pan@intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
LKML-Reference: <AANLkTiltKUa3TrKR1M51eGw8FLNoQJSLT0k0_K5X3-OJ@mail.gmail.com>
---
 arch/x86/pci/mrst.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/x86/pci/mrst.c b/arch/x86/pci/mrst.c
index 8bf2fcb..1cdc02c 100644
--- a/arch/x86/pci/mrst.c
+++ b/arch/x86/pci/mrst.c
@@ -247,6 +247,10 @@ static void __devinit pci_fixed_bar_fixup(struct pci_dev *dev)
 	u32 size;
 	int i;
 
+	/* Must have extended configuration space */
+	if (dev->cfg_size < PCIE_CAP_OFFSET + 4)
+		return;
+
 	/* Fixup the BAR sizes for fixed BAR devices and make them unmoveable */
 	offset = fixed_bar_cap(dev->bus, dev->devfn);
 	if (!offset || PCI_DEVFN(2, 0) == dev->devfn ||

^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2010-05-14 21:40 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-05-01  2:54 [PATCH 2.6.34-rcX] Do not expect PCI devices to return zeroes in PCIe space Petr Vandrovec
2010-05-04  6:21 ` Pan, Jacob jun
2010-05-04  7:31   ` Petr Vandrovec
2010-05-14 18:37 ` H. Peter Anvin
2010-05-14 20:51   ` Petr Vandrovec
2010-05-14 21:39     ` [tip:x86/urgent] x86, mrst: Don't blindly access extended config space tip-bot for H. Peter Anvin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox