* Re: [PATCH 0/5]PCI: x86 MMCONFIG
  [not found] ` <fa.3T2SqNjavN55hanLOjr3RO+WalE@ifi.uio.no>
@ 2007-12-20 0:12 ` Robert Hancock
  0 siblings, 0 replies; 7+ messages in thread
From: Robert Hancock @ 2007-12-20 0:12 UTC (permalink / raw)
  To: Greg KH; +Cc: tcamuso, linux-kernel, linux-pci, prarit

Greg KH wrote:
> On Wed, Dec 19, 2007 at 05:17:46PM -0500, tcamuso@redhat.com wrote:
>> OVERVIEW
>> =======
>>
>> The patches should be applied in sequence to obviate any
>> possible build problems.
>>
>> The patch-set was built against 2.6.24-rc5
>>
>> Description
>> ===========
>>
>> There exist devices that do not respond correctly to PCI
>> MMCONFIG accesses in x86 platforms.
>
> What devices are these? Do you have reports of them somewhere?
>
>> This patch-set detects the problem by comparing an MMCONFIG
>> read to a Legacy PCI config read of the vendor/device dword
>> of every device discovered during the PCI probing sequence.
>>
>> A miscompare means that a device does not correctly respond
>> to MMCONFIG accesses. When the patch code detects this condition,
>> the bus that serves this device, and all subordinate buses, will
>> be programmed to use Legacy PCI Config accesses.
>>
>> This patch-set DOES NOT detect devices that generate machine
>> checks against MMCONFIG accesses. For such systems,
>> "pci=nommconf" is required in the boot command.
>
> That sounds like this patchset can cause bad side effects on hardware
> that currently works just fine. That is not a good thing to be adding
> to the kernel, right?

I think we need more details on why this patch is needed.

Also, we already have something like this in
arch/x86/pci/mmconfig-shared.c, in the unreachable_devices function.
This attempts to detect devices where MMCONFIG cannot access the
configuration space (one of these would be at least one device in the
AMD K8 built-in northbridge). If this is not sufficient, I would suggest
expanding that mechanism instead of adding all this new code.

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from hancockr@nospamshaw.ca
Home Page: http://www.roberthancock.com/

^ permalink raw reply [flat|nested] 7+ messages in thread
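For readers following the thread, the read-compare check described in the quoted patch overview could be sketched roughly as follows. This is a userspace illustration only, not code from the patch-set: the accessor type cfg_read_fn, the bus_uses_legacy table, the helper names, and the two fake readers are hypothetical names invented here, and the device ID value is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_BUSES 256

/* Hypothetical accessor: returns the config dword at 'reg' for bus/devfn. */
typedef uint32_t (*cfg_read_fn)(uint8_t bus, uint8_t devfn, uint16_t reg);

/* Hypothetical per-bus flag: true means "use legacy config access here". */
static bool bus_uses_legacy[MAX_BUSES];

/* Stand-in for a legacy (port 0xCF8/0xCFC) read; the ID value is arbitrary. */
uint32_t fake_legacy_read(uint8_t bus, uint8_t devfn, uint16_t reg)
{
	(void)bus;
	(void)devfn;
	return reg == 0x00 ? 0x12345678u : 0u;
}

/* Stand-in for an MMCONFIG read that returns garbage on bus 4 only. */
uint32_t fake_mmconfig_read(uint8_t bus, uint8_t devfn, uint16_t reg)
{
	if (bus == 4)
		return 0xffffffffu;
	return fake_legacy_read(bus, devfn, reg);
}

/* Mark 'bus' and every bus up to 'subordinate' as legacy-only. */
void force_legacy_range(uint8_t bus, uint8_t subordinate)
{
	assert(bus <= subordinate);
	for (unsigned int b = bus; b <= subordinate; b++)
		bus_uses_legacy[b] = true;
}

/*
 * Compare the vendor/device dword (offset 0x00) read both ways.
 * On a miscompare, fall back to legacy access for the whole bus range.
 * Returns true when the two reads agree.
 */
bool probe_device(cfg_read_fn mmconfig, cfg_read_fn legacy,
		  uint8_t bus, uint8_t subordinate, uint8_t devfn)
{
	bool match = mmconfig(bus, devfn, 0x00) == legacy(bus, devfn, 0x00);

	if (!match)
		force_legacy_range(bus, subordinate);
	return match;
}
```

With these stand-ins, probing a device on bus 4 whose subordinate bus is 6 flags buses 4 through 6 as legacy-only and leaves every other bus on MMCONFIG. Note that a check of this kind only catches devices that return wrong data; it does nothing for a device that hangs or machine checks on access.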
* [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]
@ 2007-12-20 12:28 Tony Camuso
  2007-12-20 17:22 ` Greg KH
  0 siblings, 1 reply; 7+ messages in thread
From: Tony Camuso @ 2007-12-20 12:28 UTC (permalink / raw)
  To: gregkh, linux-kernel, linux-pci

-------- Original Message --------
Subject: Re: [PATCH 0/5]PCI: x86 MMCONFIG
Date: Wed, 19 Dec 2007 19:33:45 -0500
From: Tony Camuso <tcamuso@redhat.com>
Reply-To: tcamuso@redhat.com
To: Greg KH <gregkh@suse.de>
References: <20071219221746.20362.39243.sendpatchset@dhcp83-188.boston.redhat.com> <20071219231609.GE24219@suse.de>

Greg KH wrote:
> On Wed, Dec 19, 2007 at 05:17:46PM -0500, tcamuso@redhat.com wrote:
>> There exist devices that do not respond correctly to PCI
>> MMCONFIG accesses in x86 platforms.
>
> What devices are these? Do you have reports of them somewhere?
>

There are the AMD 8131 and 8132, the Serverworks HT1000 bridge chips
and the 830M/MG graphics. Not all versions of these chips present
this pathology, but there are perhaps tens of thousands of systems
out there that have the broken versions of these chipsets.

RedHat have been maintaining a blacklist of systems having these
devices. Systems in the blacklist are confined to legacy PCI
access.

However, some of these systems, high volume ones at that, also have
PCI express buses in them. By limiting the whole platform to legacy
PCI access, we put PCI express extended config capabilities out of
reach. IIRC, the PCIe spec requires platforms to access extended PCI
config space.

> That sounds like this patchset can cause bad side effects on hardware
> that currently works just fine. That is not a good thing to be adding
> to the kernel, right?
>

No, the patch set tries to obviate this without requiring endusers to
write customized scripts with "pci=nommconf" and without requiring the
RH folks to add another platform (usually belatedly) to the blacklist.

If a device is going to machine check when you touch it with an mmconfig
access, it will happen with or without this patch-set.

However, the patch-set does cover most of the devices that don't respond
well to mmconfig access. Such devices almost always return garbage when
you read from them.

The one device we know about that throws exceptions is the 830M/MG
graphics chip. This chip passes the read-compare test, so the code
merrily advances to bus sizing. When the bus sizing code writes the
BAR at offset 0x18 in this device, the system hangs.

I am thinking about a machine check handler that can catch these things,
or at least the exceptions. Aborts are not recoverable, according to
intel lore. However, the one device that we know croaks HARD seems to
throw an exception, since it happens in the same exact place every time.
I think we might be able to recover from that.

But that's in the future.

^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]
  2007-12-20 12:28 [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG] Tony Camuso
@ 2007-12-20 17:22 ` Greg KH
  2007-12-20 18:25 ` Tony Camuso
  0 siblings, 1 reply; 7+ messages in thread
From: Greg KH @ 2007-12-20 17:22 UTC (permalink / raw)
  To: Tony Camuso; +Cc: linux-kernel, linux-pci

On Thu, Dec 20, 2007 at 07:28:00AM -0500, Tony Camuso wrote:
>
>
> -------- Original Message --------
> Subject: Re: [PATCH 0/5]PCI: x86 MMCONFIG
> Date: Wed, 19 Dec 2007 19:33:45 -0500
> From: Tony Camuso <tcamuso@redhat.com>
> Reply-To: tcamuso@redhat.com
> To: Greg KH <gregkh@suse.de>
> References:
> <20071219221746.20362.39243.sendpatchset@dhcp83-188.boston.redhat.com>
> <20071219231609.GE24219@suse.de>
>
> Greg KH wrote:
>> On Wed, Dec 19, 2007 at 05:17:46PM -0500, tcamuso@redhat.com wrote:
>>> There exist devices that do not respond correctly to PCI
>>> MMCONFIG accesses in x86 platforms.
>> What devices are these? Do you have reports of them somewhere?
> There are the AMD 8131 and 8132, the Serverworks HT1000 bridge chips
> and the 830M/MG graphics. Not all versions of these chips present
> this pathology, but there are perhaps tens of thousands of systems
> out there that have the broken versions of these chipsets.

Why haven't we gotten reports about this before if this is a common
problem?

And why hasn't the vendor fixed the bios on these to work properly?

> RedHat have been maintaining a blacklist of systems having these
> devices. Systems in the blacklist are confined to legacy PCI
> access.

Do you have a pointer to this blacklist anywhere so that everyone can
benefit from this knowledge?

>> That sounds like this patchset can cause bad side effects on hardware
>> that currently works just fine. That is not a good thing to be adding
>> to the kernel, right?
> No, the patch set tries to obviate this without requiring endusers to
> write customized scripts with "pci=nommconf" and without requiring the
> RH folks to add another platform (usually belatedly) to the blacklist.
>
> If a device is going to machine check when you touch it with an mmconfig
> access, it will happen with or without this patch-set.
>
> However, the patch-set does cover most of the devices that don't respond
> well to mmconfig access. Such devices almost always return garbage when
> you read from them.
>
> The one device we know about that throws exceptions is the 830M/MG
> graphics chip. This chip passes the read-compare test, so the code
> merrily advances to bus sizing. When the bus sizing code writes the
> BAR at offset 0x18 in this device, the system hangs.

So it doesn't work at all, with or without this patch? Does the vendor
know about this?

thanks,

greg k-h

^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]
  2007-12-20 17:22 ` Greg KH
@ 2007-12-20 18:25 ` Tony Camuso
  2007-12-20 21:57 ` Greg KH
  0 siblings, 1 reply; 7+ messages in thread
From: Tony Camuso @ 2007-12-20 18:25 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-kernel, linux-pci

Greg KH wrote:
> Why haven't we gotten reports about this before if this is a common
> problem?
>
> And why hasn't the vendor fixed the bios on these to work properly?
>

I can't really answer either of these questions. All I know is the
problem exists, and we need to deal with it. That's why the
unreachable_devices() routine exists in the first place. But that
routine is limited to only the first 16 buses on segment-0. You would
like to think that hardware designers would confine legacy hardware,
or problematic hardware, to that area, but they don't.

As I said before, expanding that routine to cover more buses would
adversely impact mmconfig pci access, since the mmconfig access code
does a lookup of that bitmap for every request. I don't think we want
that bitmap and the accompanying in-line lookup to increase enough to
encompass the pci configuration of some of these larger systems.

> Do you have a pointer to this blacklist anywhere so that everyone can
> benefit from this knowledge?
>

Appended below is a code snippet embedded in the RHEL version of
mmconfig.c, for both i386 and x86_64. It does not encompass all the
systems that have (or will have) problems with mmconf. Only HP
platforms are listed, but I believe there are others.

The reason the HP platforms got caught is that they put these devices
beyond bus 16. Had the devices been within the first 16 buses, they
would have been trapped, and the problem would have been avoided.

static int __devinit disable_mmconf(struct dmi_system_id *d)
{
	pci_probe &= ~PCI_PROBE_MMCONF;
	printk(KERN_INFO "%s detected: disabling PCI MMCONFIG\n", d->ident);
	return 0;
}

/*
 * Systems which cannot use PCI MMCONFIG at this time...
 */
static struct dmi_system_id __devinitdata nommconf_dmi_table[] = {
	{
		.callback = disable_mmconf,
		.ident = "HP xw9300 Workstation",
		.matches = {
			DMI_MATCH(DMI_PRODUCT_NAME, "HP xw9300 Workstation"),
		},
	},
	{
		.callback = disable_mmconf,
		.ident = "HP xw9400 Workstation",
		.matches = {
			DMI_MATCH(DMI_PRODUCT_NAME, "HP xw9400 Workstation"),
		},
	},
	{
		.callback = disable_mmconf,
		.ident = "ProLiant DL585 G2",
		.matches = {
			DMI_MATCH(DMI_PRODUCT_NAME, "ProLiant DL585 G2"),
		},
	},
	{
		.callback = disable_mmconf,
		.ident = "HP Compaq dc5700 Microtower",
		.matches = {
			DMI_MATCH(DMI_PRODUCT_NAME, "HP Compaq dc5700 Microtower"),
		},
	},
	{}
};

>> The one device we know about that throws exceptions is the 830M/MG
>> graphics chip. This chip passes the read-compare test, so the code
>> merrily advances to bus sizing. When the bus sizing code writes the
>> BAR at offset 0x18 in this device, the system hangs.
>
> So it doesn't work at all, with or without this patch? Does the vendor
> know about this?
>
> thanks,
>
> greg k-h

I have talked to intel about this, but they haven't gotten back to me.
All I know at this point is that a mmconf write to the BAR at offset
0x18 of this device hangs the system.

^ permalink raw reply [flat|nested] 7+ messages in thread
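To make the bitmap concern above concrete, here is a small userspace model of a per-slot fallback bitmap that is consulted on every config access. It is only a sketch under assumptions: the layout (one bit per slot, 32 slots per bus, 16 checked buses) and all names such as fallback_slots, mark_unreachable, and must_use_legacy are hypothetical and are not taken from mmconfig-shared.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SLOTS_PER_BUS 32u
#define CHECKED_BUSES 16u	/* the limit discussed in the thread */

/* Hypothetical bitmap: one bit per (bus, slot) within the checked range. */
static uint8_t fallback_slots[(CHECKED_BUSES * SLOTS_PER_BUS) / 8];

/* Bytes needed to cover 'buses' buses at one bit per slot. */
size_t bitmap_bytes(unsigned int buses)
{
	return (size_t)buses * SLOTS_PER_BUS / 8;
}

/* Record that a slot cannot be reached through MMCONFIG. */
void mark_unreachable(unsigned int bus, unsigned int slot)
{
	assert(bus < CHECKED_BUSES && slot < SLOTS_PER_BUS);

	unsigned int bit = bus * SLOTS_PER_BUS + slot;

	fallback_slots[bit / 8] |= (uint8_t)(1u << (bit % 8));
}

/*
 * The check that would sit in the config access path.
 * Anything outside the checked range is never flagged, which is why
 * a problem device on a higher-numbered bus slips through.
 */
bool must_use_legacy(unsigned int bus, unsigned int slot)
{
	if (bus >= CHECKED_BUSES || slot >= SLOTS_PER_BUS)
		return false;

	unsigned int bit = bus * SLOTS_PER_BUS + slot;

	return (fallback_slots[bit / 8] >> (bit % 8)) & 1u;
}
```

Under this assumed layout, 16 buses need 64 bytes of bitmap and a full 256-bus segment would need 1024 bytes, still with one lookup per access. A slot marked on bus 0 is caught, while the same kind of device on bus 64 is not, which matches the situation described for the HP platforms.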
* Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]
  2007-12-20 18:25 ` Tony Camuso
@ 2007-12-20 21:57 ` Greg KH
  2007-12-20 22:36 ` Tony Camuso
  0 siblings, 1 reply; 7+ messages in thread
From: Greg KH @ 2007-12-20 21:57 UTC (permalink / raw)
  To: Tony Camuso; +Cc: linux-kernel, linux-pci

On Thu, Dec 20, 2007 at 01:25:57PM -0500, Tony Camuso wrote:
> Appended below is a code snippet embedded in the RHEL version of
> mmconfig.c, for both i386 and x86_64. It does not encompass all the
> systems that have (or will have) problems with mmconf. Only HP
> platforms are listed, but I believe there are others.
>
> The reason the HP platforms got caught is that they put these devices
> beyond bus 16. Had the devices been within the first 16 buses, they
> would have been trapped, and the problem would have been avoided.

Any reason why these changes were never submitted to the upstream kernel
versions? Or do you all just want to keep patching your newer releases
with this information forever? :)

Please send these kinds of changes upstream...

>>> The one device we know about that throws exceptions is the 830M/MG
>>> graphics chip. This chip passes the read-compare test, so the code
>>> merrily advances to bus sizing. When the bus sizing code writes the
>>> BAR at offset 0x18 in this device, the system hangs.
>> So it doesn't work at all, with or without this patch? Does the vendor
>> know about this?
>> thanks,
>> greg k-h
>
> I have talked to intel about this, but they haven't gotten back to me.

Try poking them harder, they are usually very responsive to Linux kernel
issues.

thanks,

greg k-h

^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]
  2007-12-20 21:57 ` Greg KH
@ 2007-12-20 22:36 ` Tony Camuso
  2007-12-20 22:40 ` Greg KH
  0 siblings, 1 reply; 7+ messages in thread
From: Tony Camuso @ 2007-12-20 22:36 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-kernel, linux-pci

Greg KH wrote:
> On Thu, Dec 20, 2007 at 01:25:57PM -0500, Tony Camuso wrote:
> Any reason why these changes were never submitted to the upstream kernel
> versions? Or do you all just want to keep patching your newer releases
> with this information forever? :)
>

I really don't know why these changes were never submitted to the
upstream kernel versions.

I was brought on the scene about six months ago as HP's on-site engineer
at RH, and this was one of the things they wanted me to do.

We wanted a solution that was more generic and could manage this
problem preemptively, rather than using blacklists. Maintenance of
blacklists is a bother and almost always done after a new system
with this problem is discovered.

Furthermore, blacklisting whole platforms to use legacy pci config
penalizes any mmconfig-friendly buses in those platforms, particularly
the pci express buses, and causes such platforms to be non-compliant
with the pci express spec.

> Please send these kinds of changes upstream...
>

As you wish. :)

>> I have talked to intel about this, but they haven't gotten back to me.
>
> Try poking them harder, they are usually very responsive to Linux kernel
> issues.
>
> thanks,
>
> greg k-h

Matthew Wilcox explained the problem to me. The hang that cannot be
fixed by my patch isn't the chipset's fault. The graphics device has a
64-bit BAR asking for 256 MB of IO.

If I understand his explanation correctly, when the bus sizing code
writes the low dword of this BAR with 0xffffffff, the chip is then
programmed to claim every MMIO reference between 0x00000000.f0000000
and 0x00000001.00000000.

It just so happens that MMCONFIG space for the dc5700 (as well as some
others) is mapped by its BIOS into that very region, so all MMCONFIG
references are now swallowed up by the graphics chip.

At that point, no further boot progress can be made.

^ permalink raw reply [flat|nested] 7+ messages in thread
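The address arithmetic behind that explanation can be worked through in a few lines of C. This is a hedged sketch and not kernel code: struct window and both helper names are made up for illustration, the model assumes a memory BAR whose high dword is left unchanged while the low dword holds the all-ones sizing value, and the MMCONFIG base addresses used in the note below are hypothetical because the thread does not give the real one for the dc5700.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Half-open address range [start, end). */
struct window {
	uint64_t start;
	uint64_t end;
};

/*
 * Range a memory BAR would decode while its low dword holds the
 * all-ones sizing value and its high dword is still 'high_dword'.
 * 'size' must be a power of two between 16 bytes and 4 GiB.
 */
struct window bar_window_during_sizing(uint32_t high_dword, uint64_t size)
{
	assert(size >= 16 && size <= (1ull << 32));
	assert((size & (size - 1)) == 0);

	/*
	 * Address bits below log2(size) are hardwired to zero, so the
	 * all-ones write lands on the top size-aligned block below 4 GiB.
	 */
	uint64_t low = (1ull << 32) - size;
	uint64_t start = ((uint64_t)high_dword << 32) | low;

	return (struct window){ start, start + size };
}

/* True when two half-open ranges share at least one address. */
bool windows_overlap(struct window a, struct window b)
{
	return a.start < b.end && b.start < a.end;
}
```

For a 256 MB BAR with a zero high dword this yields 0xf0000000 up to 0x100000000, the same range quoted above. A hypothetical 64 MB MMCONFIG region at 0xf8000000 falls inside that range, whereas one placed at 0xe0000000 would not.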
* Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]
  2007-12-20 22:36 ` Tony Camuso
@ 2007-12-20 22:40 ` Greg KH
  2008-01-08 3:20 ` [PATCH 0/5]PCI: x86 MMCONFIG Tony Camuso
  0 siblings, 1 reply; 7+ messages in thread
From: Greg KH @ 2007-12-20 22:40 UTC (permalink / raw)
  To: Tony Camuso; +Cc: linux-kernel, linux-pci

On Thu, Dec 20, 2007 at 05:36:43PM -0500, Tony Camuso wrote:
> Greg KH wrote:
>> On Thu, Dec 20, 2007 at 01:25:57PM -0500, Tony Camuso wrote:
>> Any reason why these changes were never submitted to the upstream kernel
>> versions? Or do you all just want to keep patching your newer releases
>> with this information forever? :)
>
> I really don't know why these changes were never submitted to the
> upstream kernel versions.
>
> I was brought on the scene about six months ago as HP's on-site engineer
> at RH, and this was one of the things they wanted me to do.
>
> We wanted a solution that was more generic and could manage this
> problem preemptively, rather than using blacklists. Maintenance of
> blacklists is a bother and almost always done after a new system
> with this problem is discovered.
>
> Furthermore, blacklisting whole platforms to use legacy pci config
> penalizes any mmconfig-friendly buses in those platforms, particularly
> the pci express buses, and causes such platforms to be non-compliant
> with the pci express spec.

Sure, I realize this, but it solves the problem in one way for broken
hardware, such that it at least allows it to work, right?

It also provides a better incentive for the manufacturer to fix their
bios, which as you are on-site at HP, it would seem odd that they would
just not do that instead of trying to work around this in the kernel...

thanks,

greg k-h

^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 0/5]PCI: x86 MMCONFIG
  2007-12-20 22:40 ` Greg KH
@ 2008-01-08 3:20 ` Tony Camuso
  2008-01-08 4:56 ` Greg KH
  0 siblings, 1 reply; 7+ messages in thread
From: Tony Camuso @ 2008-01-08 3:20 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-kernel, linux-pci, prarit, Nagananda Chumbalkar, Pat Schoeller

Greg,

Have you given this patch-set any more consideration?

I've submitted the changes you requested.

Regards,
Tony

^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 0/5]PCI: x86 MMCONFIG
  2008-01-08 3:20 ` [PATCH 0/5]PCI: x86 MMCONFIG Tony Camuso
@ 2008-01-08 4:56 ` Greg KH
  2008-01-08 13:14 ` Tony Camuso
  0 siblings, 1 reply; 7+ messages in thread
From: Greg KH @ 2008-01-08 4:56 UTC (permalink / raw)
  To: Tony Camuso
  Cc: Greg KH, linux-kernel, linux-pci, prarit, Nagananda Chumbalkar, Pat Schoeller

On Mon, Jan 07, 2008 at 10:20:23PM -0500, Tony Camuso wrote:
> Greg,
>
> Have you given this patch-set any more consideration?

Which patch-set, there have been a number of them :)

> I've submitted the changes you requested.

Care to respin them all so I'm not confused?

thanks,

greg k-h

^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 0/5]PCI: x86 MMCONFIG
  2008-01-08 4:56 ` Greg KH
@ 2008-01-08 13:14 ` Tony Camuso
  2008-01-08 13:36 ` Greg KH
  0 siblings, 1 reply; 7+ messages in thread
From: Tony Camuso @ 2008-01-08 13:14 UTC (permalink / raw)
  To: Greg KH
  Cc: Greg KH, linux-kernel, linux-pci, prarit, Nagananda Chumbalkar, Pat Schoeller

I'll respin the whole kit with [PATCH ?/5][V2]PCI: x86 MMCONFIG *
in the subject line, where the ? is replaced by the number of the
patch within the set and the * is replaced by a brief description,
if that's acceptable.

I can have it ready in a few hours.

On Mon, 2008-01-07 at 20:56 -0800, Greg KH wrote:
> On Mon, Jan 07, 2008 at 10:20:23PM -0500, Tony Camuso wrote:
> > Greg,
> >
> > Have you given this patch-set any more consideration?
>
> Which patch-set, there have been a number of them :)
>
> > I've submitted the changes you requested.
>
> Care to respin them all so I'm not confused?
>
> thanks,
>
> greg k-h

^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 0/5]PCI: x86 MMCONFIG
  2008-01-08 13:14 ` Tony Camuso
@ 2008-01-08 13:36 ` Greg KH
  2008-01-08 13:44 ` Tony Camuso
  0 siblings, 1 reply; 7+ messages in thread
From: Greg KH @ 2008-01-08 13:36 UTC (permalink / raw)
  To: Tony Camuso
  Cc: Greg KH, linux-kernel, linux-pci, prarit, Nagananda Chumbalkar, Pat Schoeller

On Tue, Jan 08, 2008 at 08:14:22AM -0500, Tony Camuso wrote:
> I'll respin the whole kit with [PATCH ?/5][V2]PCI: x86 MMCONFIG *
> in the subject line, where the ? is replaced by the number of the
> patch within the set and the * is replaced by a brief description,
> if that's acceptable.

That sounds great.

> I can have it ready in a few hours.

I'll be on a plane for a few hours, so don't feel the need to rush :)

thanks,

greg k-h

^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 0/5]PCI: x86 MMCONFIG
  2008-01-08 13:36 ` Greg KH
@ 2008-01-08 13:44 ` Tony Camuso
  0 siblings, 0 replies; 7+ messages in thread
From: Tony Camuso @ 2008-01-08 13:44 UTC (permalink / raw)
  To: Greg KH
  Cc: Greg KH, linux-kernel, linux-pci, prarit, Nagananda Chumbalkar, Pat Schoeller

Thanks, Greg.
Have a safe flight!

On Tue, 2008-01-08 at 05:36 -0800, Greg KH wrote:
> On Tue, Jan 08, 2008 at 08:14:22AM -0500, Tony Camuso wrote:
> > I'll respin the whole kit with [PATCH ?/5][V2]PCI: x86 MMCONFIG *
> > in the subject line, where the ? is replaced by the number of the
> > patch within the set and the * is replaced by a brief description,
> > if that's acceptable.
>
> That sounds great.
>
> > I can have it ready in a few hours.
>
> I'll be on a plane for a few hours, so don't feel the need to rush :)
>
> thanks,
>
> greg k-h

^ permalink raw reply [flat|nested] 7+ messages in thread
[parent not found: <20071219221746.20362.39243.sendpatchset@dhcp83-188.boston.redhat.com>]
* Re: [PATCH 0/5]PCI: x86 MMCONFIG
  [not found] <20071219221746.20362.39243.sendpatchset@dhcp83-188.boston.redhat.com>
@ 2007-12-19 23:16 ` Greg KH
  0 siblings, 0 replies; 7+ messages in thread
From: Greg KH @ 2007-12-19 23:16 UTC (permalink / raw)
  To: tcamuso; +Cc: linux-kernel, linux-pci, prarit

On Wed, Dec 19, 2007 at 05:17:46PM -0500, tcamuso@redhat.com wrote:
> OVERVIEW
> =======
>
> The patches should be applied in sequence to obviate any
> possible build problems.
>
> The patch-set was built against 2.6.24-rc5
>
> Description
> ===========
>
> There exist devices that do not respond correctly to PCI
> MMCONFIG accesses in x86 platforms.

What devices are these? Do you have reports of them somewhere?

> This patch-set detects the problem by comparing an MMCONFIG
> read to a Legacy PCI config read of the vendor/device dword
> of every device discovered during the PCI probing sequence.
>
> A miscompare means that a device does not correctly respond
> to MMCONFIG accesses. When the patch code detects this condition,
> the bus that serves this device, and all subordinate buses, will
> be programmed to use Legacy PCI Config accesses.
>
> This patch-set DOES NOT detect devices that generate machine
> checks against MMCONFIG accesses. For such systems,
> "pci=nommconf" is required in the boot command.

That sounds like this patchset can cause bad side effects on hardware
that currently works just fine. That is not a good thing to be adding
to the kernel, right?

thanks,

greg k-h

^ permalink raw reply [flat|nested] 7+ messages in thread