* kexec boot regression
@ 2009-12-15 11:50 Jens Axboe
2009-12-15 12:01 ` Yinghai Lu
0 siblings, 1 reply; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 11:50 UTC (permalink / raw)
To: Linux Kernel; +Cc: mingo, yinghai, rdreier
Hi,
I have this big box that takes forever to boot, so I use kexec to boot
into new kernels. Works fine, but some time past 2.6.32 it stopped
working. Instead of wasting brain cycles on finding out why, I handed
the problem to my trusty regression friend - git bisect.
This is what it found (sorry Yinghai it's you again, you owe me a beer
for hours of 2.6.32-git bisecting ;-)
99935a7a59eaca0292c1a5880e10bae03f4a5e3d is the first bad commit
commit 99935a7a59eaca0292c1a5880e10bae03f4a5e3d
Author: Yinghai Lu <yinghai@kernel.org>
Date: Sun Oct 4 21:54:24 2009 -0700
x86/PCI: read root resources from IOH on Intel
For intel systems with multi IOH, we should read peer root resources
directly from PCI config space, and don't trust _CRS.
I could not revert this single commit, as a further commit made other
changes. So I reverted 67f241f4 first and then 99935a7a. I confirmed
that this kernel then works fine.
With current -git, I get tons and tons of:
[ 16.841724] pci 0000:00:01.0: BAR 7: no parent found for bridge [io 0x6000-0x6fff]
[ 16.850368] pci 0000:00:01.0: BAR 7: can't allocate [io 0x6000-0x6fff]
[ 16.857821] pci 0000:00:01.0: BAR 8: no parent found for bridge [mem 0x9bc00000-0x9bcfffff]
[ 16.867238] pci 0000:00:01.0: BAR 8: can't allocate [mem 0x9bc00000-0x9bcfffff]
[ 16.875492] pci 0000:00:02.0: BAR 7: no parent found for bridge [io 0x5000-0x5fff]
[ 16.884137] pci 0000:00:02.0: BAR 7: can't allocate [io 0x5000-0x5fff]
[ 16.891591] pci 0000:00:02.0: BAR 8: no parent found for bridge [mem 0x9bb00000-0x9bbfffff]
[ 16.901010] pci 0000:00:02.0: BAR 8: can't allocate [mem 0x9bb00000-0x9bbfffff]
[ 16.909264] pci 0000:00:03.0: BAR 7: no parent found for bridge [io 0x4000-0x4fff]
[ 16.917908] pci 0000:00:03.0: BAR 7: can't allocate [io 0x4000-0x4fff]
[...]
I can provide a full log if needed.
--
Jens Axboe
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: kexec boot regression
2009-12-15 11:50 kexec boot regression Jens Axboe
@ 2009-12-15 12:01 ` Yinghai Lu
2009-12-15 12:14 ` Jens Axboe
0 siblings, 1 reply; 42+ messages in thread
From: Yinghai Lu @ 2009-12-15 12:01 UTC (permalink / raw)
To: Jens Axboe; +Cc: Linux Kernel, mingo, rdreier

Jens Axboe wrote:
> Hi,
>
> I have this big box that takes forever to boot, so I use kexec to boot
> into new kernels. Works fine, but some time past 2.6.32 it stopped
> working. Instead of wasting brain cycles on finding out why, I handed
> the problem to my trusty regression friend - git bisect.
>
> This is what it found (sorry Yinghai it's you again, you owe me a beer
> for hours of 2.6.32-git bisecting ;-)

sure.

>
>
> 99935a7a59eaca0292c1a5880e10bae03f4a5e3d is the first bad commit
> commit 99935a7a59eaca0292c1a5880e10bae03f4a5e3d
> Author: Yinghai Lu <yinghai@kernel.org>
> Date: Sun Oct 4 21:54:24 2009 -0700
>
> x86/PCI: read root resources from IOH on Intel
>
> For intel systems with multi IOH, we should read peer root resources
> directly from PCI config space, and don't trust _CRS.
>
>
> I could not revert this single commit, as a further commit made other
> changes. So I reverted 67f241f4 first and then 99935a7a. I confirmed
> that this kernel then works fine.
>

let see how BIOS mess it up again!

> With current -git, I get tons and tons of:
>
> [ 16.841724] pci 0000:00:01.0: BAR 7: no parent found for bridge [io
> 0x6000-0x6fff]
> [ 16.850368] pci 0000:00:01.0: BAR 7: can't allocate [io
> 0x6000-0x6fff]
> [ 16.857821] pci 0000:00:01.0: BAR 8: no parent found for bridge [mem
> 0x9bc00000-0x9bcfffff]
> [ 16.867238] pci 0000:00:01.0: BAR 8: can't allocate [mem
> 0x9bc00000-0x9bcfffff]
> [ 16.875492] pci 0000:00:02.0: BAR 7: no parent found for bridge [io
> 0x5000-0x5fff]
> [ 16.884137] pci 0000:00:02.0: BAR 7: can't allocate [io
> 0x5000-0x5fff]
> [ 16.891591] pci 0000:00:02.0: BAR 8: no parent found for bridge [mem
> 0x9bb00000-0x9bbfffff]
> [ 16.901010] pci 0000:00:02.0: BAR 8: can't allocate [mem
> 0x9bb00000-0x9bbfffff]
> [ 16.909264] pci 0000:00:03.0: BAR 7: no parent found for bridge [io
> 0x4000-0x4fff]
> [ 16.917908] pci 0000:00:03.0: BAR 7: can't allocate [io
> 0x4000-0x4fff]
> [...]
>
> I can provide a full log if needed.

please.

YH
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: kexec boot regression
2009-12-15 12:01 ` Yinghai Lu
@ 2009-12-15 12:14 ` Jens Axboe
2009-12-15 12:31 ` Yinghai Lu
0 siblings, 1 reply; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 12:14 UTC (permalink / raw)
To: Yinghai Lu; +Cc: Linux Kernel, mingo, rdreier
[-- Attachment #1: Type: text/plain, Size: 1343 bytes --]

On Tue, Dec 15 2009, Yinghai Lu wrote:
> Jens Axboe wrote:
> > Hi,
> >
> > I have this big box that takes forever to boot, so I use kexec to boot
> > into new kernels. Works fine, but some time past 2.6.32 it stopped
> > working. Instead of wasting brain cycles on finding out why, I handed
> > the problem to my trusty regression friend - git bisect.
> >
> > This is what it found (sorry Yinghai it's you again, you owe me a beer
> > for hours of 2.6.32-git bisecting ;-)
>
> sure.
>
> >
> >
> > 99935a7a59eaca0292c1a5880e10bae03f4a5e3d is the first bad commit
> > commit 99935a7a59eaca0292c1a5880e10bae03f4a5e3d
> > Author: Yinghai Lu <yinghai@kernel.org>
> > Date: Sun Oct 4 21:54:24 2009 -0700
> >
> > x86/PCI: read root resources from IOH on Intel
> >
> > For intel systems with multi IOH, we should read peer root resources
> > directly from PCI config space, and don't trust _CRS.
> >
> >
> > I could not revert this single commit, as a further commit made other
> > changes. So I reverted 67f241f4 first and then 99935a7a. I confirmed
> > that this kernel then works fine.
> >
>
> let see how BIOS mess it up again!

Heh, I had a feeling this was coming :-)

> please.

Please find two logs attached - one from a boot with -git and the two
patches reverted, and one from a boot with -git.

--
Jens Axboe
[-- Attachment #2: good-boot.log.gz --]
[-- Type: application/octet-stream, Size: 15915 bytes --]
[-- Attachment #3: bad-boot.log.gz --]
[-- Type: application/octet-stream, Size: 14734 bytes --]
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: kexec boot regression
2009-12-15 12:14 ` Jens Axboe
@ 2009-12-15 12:31 ` Yinghai Lu
2009-12-15 12:39 ` Jens Axboe
0 siblings, 1 reply; 42+ messages in thread
From: Yinghai Lu @ 2009-12-15 12:31 UTC (permalink / raw)
To: Jens Axboe; +Cc: Linux Kernel, mingo, rdreier

Jens Axboe wrote:
> On Tue, Dec 15 2009, Yinghai Lu wrote:
>> Jens Axboe wrote:
>>> Hi,
>>>
>>> I have this big box that takes forever to boot, so I use kexec to boot
>>> into new kernels. Works fine, but some time past 2.6.32 it stopped
>>> working. Instead of wasting brain cycles on finding out why, I handed
>>> the problem to my trusty regression friend - git bisect.
>>>
>>> This is what it found (sorry Yinghai it's you again, you owe me a beer
>>> for hours of 2.6.32-git bisecting ;-)
>> sure.
>>
>>>
>>> 99935a7a59eaca0292c1a5880e10bae03f4a5e3d is the first bad commit
>>> commit 99935a7a59eaca0292c1a5880e10bae03f4a5e3d
>>> Author: Yinghai Lu <yinghai@kernel.org>
>>> Date: Sun Oct 4 21:54:24 2009 -0700
>>>
>>> x86/PCI: read root resources from IOH on Intel
>>>
>>> For intel systems with multi IOH, we should read peer root resources
>>> directly from PCI config space, and don't trust _CRS.
>>>
>>>
>>> I could not revert this single commit, as a further commit made other
>>> changes. So I reverted 67f241f4 first and then 99935a7a. I confirmed
>>> that this kernel then works fine.
>>>
>> let see how BIOS mess it up again!
>
> Heh, I had a feeling this was coming :-)
>
>> please.
>
> Please find two logs attached - one from a boot with -git and the two
> patches reverted, and one from a boot with -git.

please enabled CONFIG_PCI_DEBUG and boot with debug in boot command line.

Thanks

Yinghai
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: kexec boot regression
2009-12-15 12:31 ` Yinghai Lu
@ 2009-12-15 12:39 ` Jens Axboe
2009-12-15 12:55 ` Yinghai Lu
0 siblings, 1 reply; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 12:39 UTC (permalink / raw)
To: Yinghai Lu; +Cc: Linux Kernel, mingo, rdreier

On Tue, Dec 15 2009, Yinghai Lu wrote:
> Jens Axboe wrote:
> > On Tue, Dec 15 2009, Yinghai Lu wrote:
> >> Jens Axboe wrote:
> >>> Hi,
> >>>
> >>> I have this big box that takes forever to boot, so I use kexec to boot
> >>> into new kernels. Works fine, but some time past 2.6.32 it stopped
> >>> working. Instead of wasting brain cycles on finding out why, I handed
> >>> the problem to my trusty regression friend - git bisect.
> >>>
> >>> This is what it found (sorry Yinghai it's you again, you owe me a beer
> >>> for hours of 2.6.32-git bisecting ;-)
> >> sure.
> >>
> >>>
> >>> 99935a7a59eaca0292c1a5880e10bae03f4a5e3d is the first bad commit
> >>> commit 99935a7a59eaca0292c1a5880e10bae03f4a5e3d
> >>> Author: Yinghai Lu <yinghai@kernel.org>
> >>> Date: Sun Oct 4 21:54:24 2009 -0700
> >>>
> >>> x86/PCI: read root resources from IOH on Intel
> >>>
> >>> For intel systems with multi IOH, we should read peer root resources
> >>> directly from PCI config space, and don't trust _CRS.
> >>>
> >>>
> >>> I could not revert this single commit, as a further commit made other
> >>> changes. So I reverted 67f241f4 first and then 99935a7a. I confirmed
> >>> that this kernel then works fine.
> >>>
> >> let see how BIOS mess it up again!
> >
> > Heh, I had a feeling this was coming :-)
> >
> >> please.
> >
> > Please find two logs attached - one from a boot with -git and the two
> > patches reverted, and one from a boot with -git.
>
> please enabled CONFIG_PCI_DEBUG and boot with debug in boot command line.

On the good or bad kernel?

--
Jens Axboe
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: kexec boot regression
2009-12-15 12:39 ` Jens Axboe
@ 2009-12-15 12:55 ` Yinghai Lu
2009-12-15 14:11 ` Jens Axboe
0 siblings, 1 reply; 42+ messages in thread
From: Yinghai Lu @ 2009-12-15 12:55 UTC (permalink / raw)
To: Jens Axboe; +Cc: Linux Kernel, mingo, rdreier

Jens Axboe wrote:
> On Tue, Dec 15 2009, Yinghai Lu wrote:
>> Jens Axboe wrote:
>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>> Jens Axboe wrote:
>>>>> Hi,
>>>>>
>>>>> I have this big box that takes forever to boot, so I use kexec to boot
>>>>> into new kernels. Works fine, but some time past 2.6.32 it stopped
>>>>> working. Instead of wasting brain cycles on finding out why, I handed
>>>>> the problem to my trusty regression friend - git bisect.
>>>>>
>>>>> This is what it found (sorry Yinghai it's you again, you owe me a beer
>>>>> for hours of 2.6.32-git bisecting ;-)
>>>> sure.
>>>>
>>>>> 99935a7a59eaca0292c1a5880e10bae03f4a5e3d is the first bad commit
>>>>> commit 99935a7a59eaca0292c1a5880e10bae03f4a5e3d
>>>>> Author: Yinghai Lu <yinghai@kernel.org>
>>>>> Date: Sun Oct 4 21:54:24 2009 -0700
>>>>>
>>>>> x86/PCI: read root resources from IOH on Intel
>>>>>
>>>>> For intel systems with multi IOH, we should read peer root resources
>>>>> directly from PCI config space, and don't trust _CRS.
>>>>>
>>>>>
>>>>> I could not revert this single commit, as a further commit made other
>>>>> changes. So I reverted 67f241f4 first and then 99935a7a. I confirmed
>>>>> that this kernel then works fine.
>>>>>
>>>> let see how BIOS mess it up again!
>>> Heh, I had a feeling this was coming :-)
>>>
>>>> please.
>>> Please find two logs attached - one from a boot with -git and the two
>>> patches reverted, and one from a boot with -git.
>> please enabled CONFIG_PCI_DEBUG and boot with debug in boot command line.
>
> On the good or bad kernel?

both please.

YH
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: kexec boot regression
2009-12-15 12:55 ` Yinghai Lu
@ 2009-12-15 14:11 ` Jens Axboe
2009-12-15 18:39 ` Yinghai Lu
0 siblings, 1 reply; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 14:11 UTC (permalink / raw)
To: Yinghai Lu; +Cc: Linux Kernel, mingo, rdreier
[-- Attachment #1: Type: text/plain, Size: 1738 bytes --]

On Tue, Dec 15 2009, Yinghai Lu wrote:
> Jens Axboe wrote:
> > On Tue, Dec 15 2009, Yinghai Lu wrote:
> >> Jens Axboe wrote:
> >>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>> Jens Axboe wrote:
> >>>>> Hi,
> >>>>>
> >>>>> I have this big box that takes forever to boot, so I use kexec to boot
> >>>>> into new kernels. Works fine, but some time past 2.6.32 it stopped
> >>>>> working. Instead of wasting brain cycles on finding out why, I handed
> >>>>> the problem to my trusty regression friend - git bisect.
> >>>>>
> >>>>> This is what it found (sorry Yinghai it's you again, you owe me a beer
> >>>>> for hours of 2.6.32-git bisecting ;-)
> >>>> sure.
> >>>>
> >>>>> 99935a7a59eaca0292c1a5880e10bae03f4a5e3d is the first bad commit
> >>>>> commit 99935a7a59eaca0292c1a5880e10bae03f4a5e3d
> >>>>> Author: Yinghai Lu <yinghai@kernel.org>
> >>>>> Date: Sun Oct 4 21:54:24 2009 -0700
> >>>>>
> >>>>> x86/PCI: read root resources from IOH on Intel
> >>>>>
> >>>>> For intel systems with multi IOH, we should read peer root resources
> >>>>> directly from PCI config space, and don't trust _CRS.
> >>>>>
> >>>>>
> >>>>> I could not revert this single commit, as a further commit made other
> >>>>> changes. So I reverted 67f241f4 first and then 99935a7a. I confirmed
> >>>>> that this kernel then works fine.
> >>>>>
> >>>> let see how BIOS mess it up again!
> >>> Heh, I had a feeling this was coming :-)
> >>>
> >>>> please.
> >>> Please find two logs attached - one from a boot with -git and the two
> >>> patches reverted, and one from a boot with -git.
> >> please enabled CONFIG_PCI_DEBUG and boot with debug in boot command line.
> >
> > On the good or bad kernel?
>
> both please.

Attached.

--
Jens Axboe
[-- Attachment #2: good-log-debug.txt.gz --]
[-- Type: application/octet-stream, Size: 41724 bytes --]
[-- Attachment #3: bad-log-debug.txt.gz --]
[-- Type: application/octet-stream, Size: 38731 bytes --]
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: kexec boot regression
2009-12-15 14:11 ` Jens Axboe
@ 2009-12-15 18:39 ` Yinghai Lu
2009-12-15 18:47 ` Matthew Wilcox
` (3 more replies)
0 siblings, 4 replies; 42+ messages in thread
From: Yinghai Lu @ 2009-12-15 18:39 UTC (permalink / raw)
To: Jens Axboe, Jesse Barnes
Cc: Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org

Jens Axboe wrote:
> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>>>>>
>>>>>> let see how BIOS mess it up again!
>>>>> Heh, I had a feeling this was coming :-)

[ 0.000000] user-defined physical RAM map:
[ 0.000000] user: 0000000000000100 - 0000000000098800 (usable)
[ 0.000000] user: 0000000000098800 - 00000000000a0000 (reserved)
[ 0.000000] user: 00000000000e0000 - 0000000000100000 (reserved)
[ 0.000000] user: 0000000000100000 - 0000000078c63000 (usable)
[ 0.000000] user: 0000000078c63000 - 0000000078e77000 (ACPI NVS)
[ 0.000000] user: 0000000078e77000 - 000000007924e000 (ACPI data)
[ 0.000000] user: 000000007924e000 - 00000000792c2000 (reserved)
[ 0.000000] user: 00000000792c2000 - 00000000792d2000 (ACPI data)
[ 0.000000] user: 00000000792d2000 - 00000000792e7000 (reserved)
[ 0.000000] user: 00000000792e7000 - 0000000079301000 (ACPI data)
[ 0.000000] user: 0000000079301000 - 0000000079303000 (reserved)
[ 0.000000] user: 0000000079303000 - 0000000079305000 (ACPI data)
[ 0.000000] user: 0000000079305000 - 0000000079310000 (reserved)
[ 0.000000] user: 0000000079310000 - 0000000079314000 (ACPI data)
[ 0.000000] user: 0000000079314000 - 0000000079319000 (reserved)
[ 0.000000] user: 0000000079319000 - 0000000079336000 (ACPI data)
[ 0.000000] user: 0000000079336000 - 0000000079358000 (reserved)
[ 0.000000] user: 0000000079358000 - 0000000079388000 (ACPI data)
[ 0.000000] user: 0000000079388000 - 00000000793c9000 (reserved)
[ 0.000000] user: 00000000793c9000 - 000000007968f000 (ACPI data)
[ 0.000000] user: 000000007968f000 - 00000000796bb000 (reserved)
[ 0.000000] user: 00000000796bb000 - 00000000799d8000 (ACPI data)
[ 0.000000] user: 00000000799d8000 - 0000000079bd8000 (ACPI NVS)
[ 0.000000] user: 0000000079bd8000 - 0000000079d87000 (ACPI data)
[ 0.000000] user: 0000000079d87000 - 0000000079d8a000 (reserved)
[ 0.000000] user: 0000000079d8a000 - 0000000079dca000 (ACPI data)
[ 0.000000] user: 0000000079dca000 - 0000000079dcb000 (reserved)
[ 0.000000] user: 0000000079dcb000 - 0000000079e1c000 (ACPI data)
[ 0.000000] user: 0000000079e1c000 - 0000000079e87000 (reserved)
[ 0.000000] user: 0000000079e87000 - 000000007bd5f000 (ACPI data)
[ 0.000000] user: 000000007bd5f000 - 000000007be4f000 (reserved)
[ 0.000000] user: 000000007be4f000 - 000000007bf87000 (ACPI data)
[ 0.000000] user: 0000000100000000 - 0000001080000000 (usable)
...
[ 0.000000] SRAT: Node 0 PXM 0 0-80000000
[ 0.000000] SRAT: Node 0 PXM 0 100000000-480000000
[ 0.000000] SRAT: Node 2 PXM 1 480000000-880000000
[ 0.000000] SRAT: Node 1 PXM 2 880000000-c80000000
[ 0.000000] SRAT: Node 3 PXM 3 c80000000-1080000000
[ 0.000000] ACPI: [SRAT:0x01] ignored 16 entries of 32 found
[ 0.000000] NUMA: Using 31 for the hash shift.
[ 0.000000] SRAT: PXMs only cover 49035MB of your 65419MB e820 RAM. Not used.
[ 0.000000] SRAT: SRAT not used.
[ 0.000000] No NUMA configuration found

so SRAT is broken?

	if (max_entries && count > max_entries) {
		printk(KERN_WARNING PREFIX "[%4.4s:0x%02x] ignored %i entries of "
		       "%i found\n", id, entry_id, count - max_entries, count);
	}
...

or what is your CONFIG_NODES_SHIFT? 3? can you try to set it to 6?

[ 13.018720] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
[ 13.100724] [Firmware Bug]: PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] not reserved in ACPI motherboard resources
[ 13.112475] PCI: not using MMCONFIG
[ 13.206650] ACPI: No dock devices found.

so mmconf is not used...<ask BIOS fix it please!>

then we get

[ 13.990335] IOH bus: [00, 00]
[ 13.993707] IOH bus: 00 index 0 io port: [0, fff]
[ 13.999023] IOH bus: 00 index 1 mmio: [0, ffffff]
[ 14.004335] IOH bus: 00 index 2 mmio: [0, 3ffffff]

please check

[PATCH] x86/pci: intel ioh bus num reg accessing fix

it is above 0x100, so if mmconf is not enable, need to skip it

Reported-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 arch/x86/pci/intel_bus.c | 4 ++++
 1 file changed, 4 insertions(+)

Index: linux-2.6/arch/x86/pci/intel_bus.c
===================================================================
--- linux-2.6.orig/arch/x86/pci/intel_bus.c
+++ linux-2.6/arch/x86/pci/intel_bus.c
@@ -49,6 +49,10 @@ static void __devinit pci_root_bus_res(s
 	u64 mmioh_base, mmioh_end;
 	int bus_base, bus_end;
 
+	/* some sys doesn't get mmconf enabled */
+	if (dev->cfg_size < 0x200)
+		return;
+
 	if (pci_root_num >= PCI_ROOT_NR) {
 		printk(KERN_DEBUG "intel_bus.c: PCI_ROOT_NR is too small\n");
 		return;
^ permalink raw reply [flat|nested] 42+ messages in thread
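Editor's note: the patch above skips the IOH register reads when dev->cfg_size is below 0x200, on the grounds that the registers sit above offset 0x100 and are only reachable through MMCONFIG. A rough way to see the same condition from userspace is sketched below. It assumes that the kernel sets cfg_size to either 256 bytes (conventional config space) or 4096 bytes (extended config space) and that the size of the device's sysfs `config` file mirrors that value; the function name `ext_config_reachable` is made up for this illustration, and `stat -c %s` is the GNU coreutils form.

```shell
#!/usr/bin/env bash
# Sketch only: decide whether registers above offset 0x100 should be reachable,
# using the same threshold as the patch (cfg_size < 0x200 means "skip").
# $1 is the path to a device's config file, e.g. (hypothetical device)
# /sys/bus/pci/devices/0000:00:00.0/config
ext_config_reachable() {
    local cfg_file=$1 size
    # File size in bytes; assumed to equal the kernel's cfg_size for the device.
    size=$(stat -c %s "$cfg_file") || return 2
    # 0x200 = 512: anything smaller is treated as conventional-only space.
    [ "$size" -ge $(( 0x200 )) ]
}
```

On a system booted like the failing kexec case ("PCI: not using MMCONFIG"), one would expect this to report failure for the root-port devices, though that has not been checked against the machine in this thread.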
* Re: kexec boot regression
2009-12-15 18:39 ` Yinghai Lu
@ 2009-12-15 18:47 ` Matthew Wilcox
2009-12-15 18:54 ` Jens Axboe
` (2 subsequent siblings)
3 siblings, 0 replies; 42+ messages in thread
From: Matthew Wilcox @ 2009-12-15 18:47 UTC (permalink / raw)
To: Yinghai Lu
Cc: Jens Axboe, Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org

On Tue, Dec 15, 2009 at 10:39:37AM -0800, Yinghai Lu wrote:
> +	/* some sys doesn't get mmconf enabled */
> +	if (dev->cfg_size < 0x200)
> +		return;

What is the meaning of this mystic 0x200?

--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: kexec boot regression
2009-12-15 18:39 ` Yinghai Lu
2009-12-15 18:47 ` Matthew Wilcox
@ 2009-12-15 18:54 ` Jens Axboe
2009-12-15 18:59 ` Jens Axboe
2009-12-15 19:43 ` kexec boot regression Jens Axboe
3 siblings, 0 replies; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 18:54 UTC (permalink / raw)
To: Yinghai Lu
Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org

On Tue, Dec 15 2009, Yinghai Lu wrote:
> Jens Axboe wrote:
> > On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>>>>>
> >>>>>> let see how BIOS mess it up again!
> >>>>> Heh, I had a feeling this was coming :-)
>
> [ 0.000000] user-defined physical RAM map:
> [ 0.000000] user: 0000000000000100 - 0000000000098800 (usable)
> [ 0.000000] user: 0000000000098800 - 00000000000a0000 (reserved)
> [ 0.000000] user: 00000000000e0000 - 0000000000100000 (reserved)
> [ 0.000000] user: 0000000000100000 - 0000000078c63000 (usable)
> [ 0.000000] user: 0000000078c63000 - 0000000078e77000 (ACPI NVS)
> [ 0.000000] user: 0000000078e77000 - 000000007924e000 (ACPI data)
> [ 0.000000] user: 000000007924e000 - 00000000792c2000 (reserved)
> [ 0.000000] user: 00000000792c2000 - 00000000792d2000 (ACPI data)
> [ 0.000000] user: 00000000792d2000 - 00000000792e7000 (reserved)
> [ 0.000000] user: 00000000792e7000 - 0000000079301000 (ACPI data)
> [ 0.000000] user: 0000000079301000 - 0000000079303000 (reserved)
> [ 0.000000] user: 0000000079303000 - 0000000079305000 (ACPI data)
> [ 0.000000] user: 0000000079305000 - 0000000079310000 (reserved)
> [ 0.000000] user: 0000000079310000 - 0000000079314000 (ACPI data)
> [ 0.000000] user: 0000000079314000 - 0000000079319000 (reserved)
> [ 0.000000] user: 0000000079319000 - 0000000079336000 (ACPI data)
> [ 0.000000] user: 0000000079336000 - 0000000079358000 (reserved)
> [ 0.000000] user: 0000000079358000 - 0000000079388000 (ACPI data)
> [ 0.000000] user: 0000000079388000 - 00000000793c9000 (reserved)
> [ 0.000000] user: 00000000793c9000 - 000000007968f000 (ACPI data)
> [ 0.000000] user: 000000007968f000 - 00000000796bb000 (reserved)
> [ 0.000000] user: 00000000796bb000 - 00000000799d8000 (ACPI data)
> [ 0.000000] user: 00000000799d8000 - 0000000079bd8000 (ACPI NVS)
> [ 0.000000] user: 0000000079bd8000 - 0000000079d87000 (ACPI data)
> [ 0.000000] user: 0000000079d87000 - 0000000079d8a000 (reserved)
> [ 0.000000] user: 0000000079d8a000 - 0000000079dca000 (ACPI data)
> [ 0.000000] user: 0000000079dca000 - 0000000079dcb000 (reserved)
> [ 0.000000] user: 0000000079dcb000 - 0000000079e1c000 (ACPI data)
> [ 0.000000] user: 0000000079e1c000 - 0000000079e87000 (reserved)
> [ 0.000000] user: 0000000079e87000 - 000000007bd5f000 (ACPI data)
> [ 0.000000] user: 000000007bd5f000 - 000000007be4f000 (reserved)
> [ 0.000000] user: 000000007be4f000 - 000000007bf87000 (ACPI data)
> [ 0.000000] user: 0000000100000000 - 0000001080000000 (usable)
> ...
> [ 0.000000] SRAT: Node 0 PXM 0 0-80000000
> [ 0.000000] SRAT: Node 0 PXM 0 100000000-480000000
> [ 0.000000] SRAT: Node 2 PXM 1 480000000-880000000
> [ 0.000000] SRAT: Node 1 PXM 2 880000000-c80000000
> [ 0.000000] SRAT: Node 3 PXM 3 c80000000-1080000000
> [ 0.000000] ACPI: [SRAT:0x01] ignored 16 entries of 32 found
> [ 0.000000] NUMA: Using 31 for the hash shift.
> [ 0.000000] SRAT: PXMs only cover 49035MB of your 65419MB e820 RAM. Not used.
> [ 0.000000] SRAT: SRAT not used.
> [ 0.000000] No NUMA configuration found
>
> so SRAT is broken?
>
> if (max_entries && count > max_entries) {
> printk(KERN_WARNING PREFIX "[%4.4s:0x%02x] ignored %i entries of "
> "%i found\n", id, entry_id, count - max_entries, count);
> }
> ...
>
> or what is your CONFIG_NODES_SHIFT? 3? can you try to set it to 6?

Hmm funky, perhaps the BIOS changed that too. NUMA has otherwise been
working fine, didn't check whether it still did after a BIOS upgrade.
I'll try 6, it is set to 3 iirc.

> [ 13.018720] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
> [ 13.100724] [Firmware Bug]: PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] not reserved in ACPI motherboard resources
> [ 13.112475] PCI: not using MMCONFIG
> [ 13.206650] ACPI: No dock devices found.
>
> so mmconf is not used...<ask BIOS fix it please!>

Reported, thanks.

> then we get
>
> [ 13.990335] IOH bus: [00, 00]
> [ 13.993707] IOH bus: 00 index 0 io port: [0, fff]
> [ 13.999023] IOH bus: 00 index 1 mmio: [0, ffffff]
> [ 14.004335] IOH bus: 00 index 2 mmio: [0, 3ffffff]
>
> please check
>
> [PATCH] x86/pci: intel ioh bus num reg accessing fix
>
> it is above 0x100, so if mmconf is not enable, need to skip it

Will check that now.

--
Jens Axboe
^ permalink raw reply [flat|nested] 42+ messages in thread
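Editor's note: the numbers in the "ignored 16 entries of 32 found" message are consistent with a CONFIG_NODES_SHIFT of 3. Entry type 0x01 is the SRAT memory-affinity type, and the quoted printk fires when count exceeds max_entries. The sketch below works through the arithmetic under the assumption that the limit applied to memory-affinity entries is NR_NODE_MEMBLKS = 2 * MAX_NUMNODES, with MAX_NUMNODES = 1 << CONFIG_NODES_SHIFT; that formula is not stated in the thread, and the function names are made up for this illustration.

```shell
#!/usr/bin/env bash
# Sketch only: how many SRAT memory-affinity entries fit for a given
# CONFIG_NODES_SHIFT, assuming the limit is 2 * (1 << shift).
srat_mem_entry_limit() {
    local nodes_shift=$1
    echo $(( 2 * (1 << nodes_shift) ))
}

# Number of entries dropped when the firmware table holds "found" entries.
srat_entries_ignored() {
    local nodes_shift=$1 found=$2 limit
    limit=$(srat_mem_entry_limit "$nodes_shift")
    echo $(( found > limit ? found - limit : 0 ))
}

srat_entries_ignored 3 32   # shift of 3: limit 16, so 16 of 32 are dropped
srat_entries_ignored 6 32   # shift of 6: limit 128, nothing is dropped
```

Under that assumption, raising the shift from 3 to 6 as suggested would let all 32 entries through.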
* Re: kexec boot regression
2009-12-15 18:39 ` Yinghai Lu
2009-12-15 18:47 ` Matthew Wilcox
2009-12-15 18:54 ` Jens Axboe
@ 2009-12-15 18:59 ` Jens Axboe
2009-12-15 19:04 ` Yinghai Lu
2009-12-15 19:43 ` kexec boot regression Jens Axboe
3 siblings, 1 reply; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 18:59 UTC (permalink / raw)
To: Yinghai Lu
Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org

On Tue, Dec 15 2009, Yinghai Lu wrote:
> [ 13.018720] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
>
> [ 13.100724] [Firmware Bug]: PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] not reserved in ACPI motherboard resources

On a "normal" non-kexec boot, I get:

[ 12.173583] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
[ 12.184075] PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] reserved in E820
[ 12.216874] PCI: Using configuration type 1 for base access

--
Jens Axboe
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: kexec boot regression
2009-12-15 18:59 ` Jens Axboe
@ 2009-12-15 19:04 ` Yinghai Lu
2009-12-15 19:11 ` Jens Axboe
2009-12-15 21:30 ` Markus Trippelsdorf
0 siblings, 2 replies; 42+ messages in thread
From: Yinghai Lu @ 2009-12-15 19:04 UTC (permalink / raw)
To: Jens Axboe
Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org

Jens Axboe wrote:
> On Tue, Dec 15 2009, Yinghai Lu wrote:
>> [ 13.018720] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
>>
>> [ 13.100724] [Firmware Bug]: PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] not reserved in ACPI motherboard resources
>
> On a "normal" non-kexec boot, I get:
>
> [ 12.173583] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
> [ 12.184075] PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] reserved in E820
> [ 12.216874] PCI: Using configuration type 1 for base access
>

can you run following scripts in first kernel?

cd /sys/firmware/memmap
for dir in * ; do
  start=$(cat $dir/start)
  end=$(cat $dir/end)
  type=$(cat $dir/type)
  printf "%016x-%016x (%s)\n" $start $[ $end +1] "$type" >> /tmp/memmap.txt
done

and send out /tmp/memmap.txt

what is your kexec tools version? could be too old?

YH
^ permalink raw reply [flat|nested] 42+ messages in thread
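Editor's note: the loop above can also be written as a small function, which makes it easier to rerun or point at a saved copy of the directory. This is a sketch, not the script Yinghai posted: the function name `dump_memmap` and its optional directory argument are additions for illustration, and `$(( ... ))` replaces the older `$[ ... ]` arithmetic form. It relies on the sysfs `end` value being inclusive, which is why 1 is added, exactly as in the original loop.

```shell
#!/usr/bin/env bash
# Sketch only: print one "start-end (type)" line per firmware memory-map entry.
# $1 is the memmap directory; it defaults to the sysfs path used in the thread.
dump_memmap() {
    local base=${1:-/sys/firmware/memmap}
    local dir start end type
    for dir in "$base"/*/; do
        # Skip anything that is not a memmap entry (also covers an empty glob).
        [ -r "${dir}start" ] || continue
        start=$(cat "${dir}start")
        end=$(cat "${dir}end")
        type=$(cat "${dir}type")
        # "end" is inclusive in sysfs; add 1 to print an exclusive end address.
        printf '%016x-%016x (%s)\n' "$start" "$(( end + 1 ))" "$type"
    done
}
```

Called as `dump_memmap | sort > /tmp/memmap.txt`, this gives the entries in address order. Without the `sort`, entries come out in glob order (0, 1, 10, 11, ...), so the output is not sorted by address.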
* Re: kexec boot regression
2009-12-15 19:04 ` Yinghai Lu
@ 2009-12-15 19:11 ` Jens Axboe
2009-12-15 19:17 ` Yinghai Lu
2009-12-15 19:44 ` Yinghai Lu
2009-12-15 21:30 ` Markus Trippelsdorf
1 sibling, 2 replies; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 19:11 UTC (permalink / raw)
To: Yinghai Lu
Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org

On Tue, Dec 15 2009, Yinghai Lu wrote:
> Jens Axboe wrote:
> > On Tue, Dec 15 2009, Yinghai Lu wrote:
> >> [ 13.018720] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
> >>
> >> [ 13.100724] [Firmware Bug]: PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] not reserved in ACPI motherboard resources
> >
> > On a "normal" non-kexec boot, I get:
> >
> > [ 12.173583] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
> > [ 12.184075] PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] reserved in E820
> > [ 12.216874] PCI: Using configuration type 1 for base access
> >
>
> can you run following scripts in first kernel?
>
> cd /sys/firmware/memmap
> for dir in * ; do
> start=$(cat $dir/start)
> end=$(cat $dir/end)
> type=$(cat $dir/type)
> printf "%016x-%016x (%s)\n" $start $[ $end +1] "$type" >> /tmp/memmap.txt
> done
>
> and send out /tmp/memmap.txt

Below.

> what is your kexec tools version? could be too old?

It says:

kexec-tools-testing 20080324 released 24th March 2008

0000000000000000-0000000000098800 (System RAM)
0000000000098800-00000000000a0000 (reserved)
0000000079301000-0000000079303000 (reserved)
0000000079303000-0000000079305000 (ACPI Tables)
0000000079305000-0000000079310000 (reserved)
0000000079310000-0000000079314000 (ACPI Tables)
0000000079314000-0000000079319000 (reserved)
0000000079319000-0000000079336000 (ACPI Tables)
0000000079336000-0000000079358000 (reserved)
0000000079358000-0000000079388000 (ACPI Tables)
0000000079388000-00000000793c9000 (reserved)
00000000793c9000-000000007968f000 (ACPI Tables)
00000000000e0000-0000000000100000 (reserved)
000000007968f000-00000000796bb000 (reserved)
00000000796bb000-00000000799d8000 (ACPI Tables)
00000000799d8000-0000000079bd8000 (ACPI Non-volatile Storage)
0000000079bd8000-0000000079d8b000 (ACPI Tables)
0000000079d8b000-0000000079d8c000 (reserved)
0000000079d8c000-0000000079dc8000 (ACPI Tables)
0000000079dc8000-0000000079dcb000 (reserved)
0000000079dcb000-0000000079e1c000 (ACPI Tables)
0000000079e1c000-0000000079e87000 (reserved)
0000000079e87000-000000007bd5f000 (ACPI Tables)
0000000000100000-0000000078c59000 (System RAM)
000000007bd5f000-000000007be4f000 (reserved)
000000007be4f000-000000007bf87000 (ACPI Tables)
000000007bf87000-000000007bfcf000 (ACPI Non-volatile Storage)
000000007bfcf000-000000007bfff000 (ACPI Tables)
000000007bfff000-0000000090000000 (reserved)
00000000fc000000-00000000fd000000 (reserved)
00000000fed1c000-00000000fed20000 (reserved)
00000000ff000000-0000000100000000 (reserved)
0000000100000000-0000001080000000 (System RAM)
0000000078c59000-0000000078e6d000 (ACPI Non-volatile Storage)
0000000078e6d000-000000007924e000 (ACPI Tables)
000000007924e000-00000000792c2000 (reserved)
00000000792c2000-00000000792d2000 (ACPI Tables)
00000000792d2000-00000000792e7000 (reserved)
00000000792e7000-0000000079301000 (ACPI Tables)

--
Jens Axboe
^ permalink raw reply [flat|nested] 42+ messages in thread
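Editor's note: in the listing above, the MMCONFIG window [mem 0x80000000-0x8fffffff] falls inside the 000000007bfff000-0000000090000000 (reserved) entry. The user-defined RAM map printed by the kexec'd kernel earlier in the thread has no entry between 7bf87000 and 100000000. Whether that gap explains the "not reserved" warning is not established here. The sketch below only automates the lookup. It reads lines in the "start-end (type)" format that the memmap script produces, so the kexec'd kernel's "start - end (type)" log lines would need normalising first; `range_in_map` is a made-up name.

```shell
#!/usr/bin/env bash
# Sketch only: succeed if [start, end) lies inside a single map entry of the
# given type. Reads "start-end (type)" lines on stdin, with exclusive end
# addresses in hex without a 0x prefix.
range_in_map() {
    local want_start=$(( $1 )) want_end=$(( $2 )) want_type=$3
    local line range type s e
    while IFS= read -r line; do
        range=${line%% *}                 # "start-end"
        type=${line#*\(}                  # drop everything up to "("
        type=${type%\)}                   # drop the trailing ")"
        [ "$type" = "$want_type" ] || continue
        s=$(( 16#${range%-*} ))
        e=$(( 16#${range#*-} ))
        if [ "$s" -le "$want_start" ] && [ "$want_end" -le "$e" ]; then
            return 0
        fi
    done
    return 1
}
```

For example, `range_in_map 0x80000000 0x90000000 reserved < /tmp/memmap.txt` asks whether the whole MMCONFIG window is covered by one reserved entry. The end argument is exclusive, so 0x8fffffff becomes 0x90000000.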
* Re: kexec boot regression 2009-12-15 19:11 ` Jens Axboe @ 2009-12-15 19:17 ` Yinghai Lu 2009-12-15 19:22 ` Jens Axboe 2009-12-15 19:44 ` Yinghai Lu 1 sibling, 1 reply; 42+ messages in thread From: Yinghai Lu @ 2009-12-15 19:17 UTC (permalink / raw) To: Jens Axboe Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org Jens Axboe wrote: > On Tue, Dec 15 2009, Yinghai Lu wrote: >> Jens Axboe wrote: >>> On Tue, Dec 15 2009, Yinghai Lu wrote: >>>> [ 13.018720] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000) >>>> >>>> [ 13.100724] [Firmware Bug]: PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] not reserved in ACPI motherboard resources >>> On a "normal" non-kexec boot, I get: >>> >>> [ 12.173583] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000) >>> [ 12.184075] PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] reserved in E820 >>> [ 12.216874] PCI: Using configuration type 1 for base access >>> >> can you run following scripts in first kernel? >> >> cd /sys/firmware/memmap >> for dir in * ; do >> start=$(cat $dir/start) >> end=$(cat $dir/end) >> type=$(cat $dir/type) >> printf "%016x-%016x (%s)\n" $start $[ $end +1] "$type" >> /tmp/memmap.txt >> done >> >> and send out /tmp/memmap.txt > > Below. > >> what is your kexec tools version? could be too old? 
> > It says: > > kexec-tools-testing 20080324 released 24th March 2008 > > > 0000000000000000-0000000000098800 (System RAM) > 0000000000098800-00000000000a0000 (reserved) > 0000000079301000-0000000079303000 (reserved) > 0000000079303000-0000000079305000 (ACPI Tables) > 0000000079305000-0000000079310000 (reserved) > 0000000079310000-0000000079314000 (ACPI Tables) > 0000000079314000-0000000079319000 (reserved) > 0000000079319000-0000000079336000 (ACPI Tables) > 0000000079336000-0000000079358000 (reserved) > 0000000079358000-0000000079388000 (ACPI Tables) > 0000000079388000-00000000793c9000 (reserved) > 00000000793c9000-000000007968f000 (ACPI Tables) > 00000000000e0000-0000000000100000 (reserved) > 000000007968f000-00000000796bb000 (reserved) > 00000000796bb000-00000000799d8000 (ACPI Tables) > 00000000799d8000-0000000079bd8000 (ACPI Non-volatile Storage) > 0000000079bd8000-0000000079d8b000 (ACPI Tables) > 0000000079d8b000-0000000079d8c000 (reserved) > 0000000079d8c000-0000000079dc8000 (ACPI Tables) > 0000000079dc8000-0000000079dcb000 (reserved) > 0000000079dcb000-0000000079e1c000 (ACPI Tables) > 0000000079e1c000-0000000079e87000 (reserved) > 0000000079e87000-000000007bd5f000 (ACPI Tables) > 0000000000100000-0000000078c59000 (System RAM) > 000000007bd5f000-000000007be4f000 (reserved) > 000000007be4f000-000000007bf87000 (ACPI Tables) > 000000007bf87000-000000007bfcf000 (ACPI Non-volatile Storage) > 000000007bfcf000-000000007bfff000 (ACPI Tables) > 000000007bfff000-0000000090000000 (reserved) > 00000000fc000000-00000000fd000000 (reserved) > 00000000fed1c000-00000000fed20000 (reserved) > 00000000ff000000-0000000100000000 (reserved) > 0000000100000000-0000001080000000 (System RAM) > 0000000078c59000-0000000078e6d000 (ACPI Non-volatile Storage) > 0000000078e6d000-000000007924e000 (ACPI Tables) > 000000007924e000-00000000792c2000 (reserved) > 00000000792c2000-00000000792d2000 (ACPI Tables) > 00000000792d2000-00000000792e7000 (reserved) > 00000000792e7000-0000000079301000 
(ACPI Tables)
>

boot log of first kernel?

YH
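For readers who want to reproduce the dump requested above, here is a minimal Python sketch of the same loop. It is not part of the original exchange. It assumes each entry directory under /sys/firmware/memmap holds `start`, `end` and `type` files with hexadecimal addresses and an inclusive `end`, which is what the quoted shell loop relies on. The names `read_memmap` and `format_entry` are made up for illustration. Unlike the shell glob, which expands in lexical order (0, 1, 10, 11, ...) and is presumably why the posted dumps are not sorted by address, this version sorts by start address.

```python
import os


def read_memmap(root="/sys/firmware/memmap"):
    """Return (start, end_exclusive, type) tuples sorted by start address.

    Assumption: every entry directory has 'start', 'end' and 'type' files,
    with hexadecimal addresses and an inclusive 'end'.
    """
    entries = []
    for name in os.listdir(root):
        entry_dir = os.path.join(root, name)
        if not os.path.isdir(entry_dir):
            continue
        with open(os.path.join(entry_dir, "start")) as f:
            start = int(f.read().strip(), 16)
        with open(os.path.join(entry_dir, "end")) as f:
            end = int(f.read().strip(), 16)
        with open(os.path.join(entry_dir, "type")) as f:
            kind = f.read().strip()
        # Same "+1" as in the shell loop: print an exclusive end address.
        entries.append((start, end + 1, kind))
    return sorted(entries)


def format_entry(start, end, kind):
    """Format one entry the way the shell printf in the thread does."""
    return "%016x-%016x (%s)" % (start, end, kind)
```

Calling `read_memmap()` on a running system and printing `format_entry(*e)` for each entry should give the same lines as /tmp/memmap.txt, only in address order.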
* Re: kexec boot regression 2009-12-15 19:17 ` Yinghai Lu @ 2009-12-15 19:22 ` Jens Axboe 2009-12-15 19:28 ` Jens Axboe 0 siblings, 1 reply; 42+ messages in thread From: Jens Axboe @ 2009-12-15 19:22 UTC (permalink / raw) To: Yinghai Lu Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org On Tue, Dec 15 2009, Yinghai Lu wrote: > Jens Axboe wrote: > > On Tue, Dec 15 2009, Yinghai Lu wrote: > >> Jens Axboe wrote: > >>> On Tue, Dec 15 2009, Yinghai Lu wrote: > >>>> [ 13.018720] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000) > >>>> > >>>> [ 13.100724] [Firmware Bug]: PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] not reserved in ACPI motherboard resources > >>> On a "normal" non-kexec boot, I get: > >>> > >>> [ 12.173583] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000) > >>> [ 12.184075] PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] reserved in E820 > >>> [ 12.216874] PCI: Using configuration type 1 for base access > >>> > >> can you run following scripts in first kernel? > >> > >> cd /sys/firmware/memmap > >> for dir in * ; do > >> start=$(cat $dir/start) > >> end=$(cat $dir/end) > >> type=$(cat $dir/type) > >> printf "%016x-%016x (%s)\n" $start $[ $end +1] "$type" >> /tmp/memmap.txt > >> done > >> > >> and send out /tmp/memmap.txt > > > > Below. > > > >> what is your kexec tools version? could be too old? 
> > > > It says: > > > > kexec-tools-testing 20080324 released 24th March 2008 > > > > > > 0000000000000000-0000000000098800 (System RAM) > > 0000000000098800-00000000000a0000 (reserved) > > 0000000079301000-0000000079303000 (reserved) > > 0000000079303000-0000000079305000 (ACPI Tables) > > 0000000079305000-0000000079310000 (reserved) > > 0000000079310000-0000000079314000 (ACPI Tables) > > 0000000079314000-0000000079319000 (reserved) > > 0000000079319000-0000000079336000 (ACPI Tables) > > 0000000079336000-0000000079358000 (reserved) > > 0000000079358000-0000000079388000 (ACPI Tables) > > 0000000079388000-00000000793c9000 (reserved) > > 00000000793c9000-000000007968f000 (ACPI Tables) > > 00000000000e0000-0000000000100000 (reserved) > > 000000007968f000-00000000796bb000 (reserved) > > 00000000796bb000-00000000799d8000 (ACPI Tables) > > 00000000799d8000-0000000079bd8000 (ACPI Non-volatile Storage) > > 0000000079bd8000-0000000079d8b000 (ACPI Tables) > > 0000000079d8b000-0000000079d8c000 (reserved) > > 0000000079d8c000-0000000079dc8000 (ACPI Tables) > > 0000000079dc8000-0000000079dcb000 (reserved) > > 0000000079dcb000-0000000079e1c000 (ACPI Tables) > > 0000000079e1c000-0000000079e87000 (reserved) > > 0000000079e87000-000000007bd5f000 (ACPI Tables) > > 0000000000100000-0000000078c59000 (System RAM) > > 000000007bd5f000-000000007be4f000 (reserved) > > 000000007be4f000-000000007bf87000 (ACPI Tables) > > 000000007bf87000-000000007bfcf000 (ACPI Non-volatile Storage) > > 000000007bfcf000-000000007bfff000 (ACPI Tables) > > 000000007bfff000-0000000090000000 (reserved) > > 00000000fc000000-00000000fd000000 (reserved) > > 00000000fed1c000-00000000fed20000 (reserved) > > 00000000ff000000-0000000100000000 (reserved) > > 0000000100000000-0000001080000000 (System RAM) > > 0000000078c59000-0000000078e6d000 (ACPI Non-volatile Storage) > > 0000000078e6d000-000000007924e000 (ACPI Tables) > > 000000007924e000-00000000792c2000 (reserved) > > 00000000792c2000-00000000792d2000 (ACPI Tables) 
> > 00000000792d2000-00000000792e7000 (reserved) > > 00000000792e7000-0000000079301000 (ACPI Tables) > > > > boot log of first kernel? Hmm not completely sure, let me re-do it after a cold boot. BTW, I just checked, and 2.6.32 has NUMA working fine. Below is the SRAT and NUMA output from 2.6.32 (kexec'ed kernel). Is the check a newly introduced one? [ 0.000000] SRAT: PXM 0 -> APIC 0 -> Node 0 [ 0.000000] SRAT: PXM 2 -> APIC 64 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 32 -> Node 2 [ 0.000000] SRAT: PXM 3 -> APIC 96 -> Node 3 [ 0.000000] SRAT: PXM 0 -> APIC 2 -> Node 0 [ 0.000000] SRAT: PXM 2 -> APIC 66 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 34 -> Node 2 [ 0.000000] SRAT: PXM 3 -> APIC 98 -> Node 3 [ 0.000000] SRAT: PXM 0 -> APIC 4 -> Node 0 [ 0.000000] SRAT: PXM 2 -> APIC 68 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 36 -> Node 2 [ 0.000000] SRAT: PXM 3 -> APIC 100 -> Node 3 [ 0.000000] SRAT: PXM 0 -> APIC 6 -> Node 0 [ 0.000000] SRAT: PXM 2 -> APIC 70 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 38 -> Node 2 [ 0.000000] SRAT: PXM 3 -> APIC 102 -> Node 3 [ 0.000000] SRAT: PXM 0 -> APIC 16 -> Node 0 [ 0.000000] SRAT: PXM 2 -> APIC 80 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 48 -> Node 2 [ 0.000000] SRAT: PXM 3 -> APIC 112 -> Node 3 [ 0.000000] SRAT: PXM 0 -> APIC 18 -> Node 0 [ 0.000000] SRAT: PXM 2 -> APIC 82 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 50 -> Node 2 [ 0.000000] SRAT: PXM 3 -> APIC 114 -> Node 3 [ 0.000000] SRAT: PXM 0 -> APIC 20 -> Node 0 [ 0.000000] SRAT: PXM 2 -> APIC 84 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 52 -> Node 2 [ 0.000000] SRAT: PXM 3 -> APIC 116 -> Node 3 [ 0.000000] SRAT: PXM 0 -> APIC 22 -> Node 0 [ 0.000000] SRAT: PXM 2 -> APIC 86 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 54 -> Node 2 [ 0.000000] SRAT: PXM 3 -> APIC 118 -> Node 3 [ 0.000000] SRAT: PXM 0 -> APIC 1 -> Node 0 [ 0.000000] SRAT: PXM 2 -> APIC 65 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 33 -> Node 2 [ 0.000000] SRAT: PXM 3 -> APIC 97 -> Node 3 [ 0.000000] SRAT: PXM 0 -> 
APIC 3 -> Node 0 [ 0.000000] SRAT: PXM 2 -> APIC 67 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 35 -> Node 2 [ 0.000000] SRAT: PXM 3 -> APIC 99 -> Node 3 [ 0.000000] SRAT: PXM 0 -> APIC 5 -> Node 0 [ 0.000000] SRAT: PXM 2 -> APIC 69 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 37 -> Node 2 [ 0.000000] SRAT: PXM 3 -> APIC 101 -> Node 3 [ 0.000000] SRAT: PXM 0 -> APIC 7 -> Node 0 [ 0.000000] SRAT: PXM 2 -> APIC 71 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 39 -> Node 2 [ 0.000000] SRAT: PXM 3 -> APIC 103 -> Node 3 [ 0.000000] SRAT: PXM 0 -> APIC 17 -> Node 0 [ 0.000000] SRAT: PXM 2 -> APIC 81 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 49 -> Node 2 [ 0.000000] SRAT: PXM 3 -> APIC 113 -> Node 3 [ 0.000000] SRAT: PXM 0 -> APIC 19 -> Node 0 [ 0.000000] SRAT: PXM 2 -> APIC 83 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 51 -> Node 2 [ 0.000000] SRAT: PXM 3 -> APIC 115 -> Node 3 [ 0.000000] SRAT: PXM 0 -> APIC 21 -> Node 0 [ 0.000000] SRAT: PXM 2 -> APIC 85 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 53 -> Node 2 [ 0.000000] SRAT: PXM 3 -> APIC 117 -> Node 3 [ 0.000000] SRAT: PXM 0 -> APIC 23 -> Node 0 [ 0.000000] SRAT: PXM 2 -> APIC 87 -> Node 1 [ 0.000000] SRAT: PXM 1 -> APIC 55 -> Node 2 [ 0.000000] SRAT: PXM 3 -> APIC 119 -> Node 3 [ 0.000000] SRAT: Node 0 PXM 0 0-80000000 [ 0.000000] SRAT: Node 0 PXM 0 100000000-480000000 [ 0.000000] SRAT: Node 2 PXM 1 480000000-880000000 [ 0.000000] SRAT: Node 1 PXM 2 880000000-c80000000 [ 0.000000] SRAT: Node 3 PXM 3 c80000000-1080000000 [ 0.000000] NUMA: Using 31 for the hash shift. 
[ 0.000000] Bootmem setup node 0 0000000000000000-0000000480000000 [ 0.000000] NODE_DATA [0000000000048000 - 000000000004cfff] [ 0.000000] bootmap [0000000000100000 - 000000000018ffff] pages 90 [ 0.000000] (8 early reservations) ==> bootmem [0000000000 - 0480000000] [ 0.000000] #0 [0000000000 - 0000001000] BIOS data page ==> [0000000000 - 0000001000] [ 0.000000] #1 [0000006000 - 0000008000] TRAMPOLINE ==> [0000006000 - 0000008000] [ 0.000000] #2 [0001000000 - 000200f260] TEXT DATA BSS ==> [0001000000 - 000200f260] [ 0.000000] #3 [0000098800 - 0000100000] BIOS reserved ==> [0000098800 - 0000100000] [ 0.000000] #4 [0002010000 - 000201035c] BRK ==> [0002010000 - 000201035c] [ 0.000000] #5 [0000008000 - 000000a000] PGTABLE ==> [0000008000 - 000000a000] [ 0.000000] #6 [000000a000 - 0000048000] PGTABLE ==> [000000a000 - 0000048000] [ 0.000000] #7 [0000001000 - 000000103c] ACPI SLIT ==> [0000001000 - 000000103c] [ 0.000000] Bootmem setup node 1 0000000880000000-0000000c80000000 [ 0.000000] NODE_DATA [0000000880000000 - 0000000880004fff] [ 0.000000] bootmap [0000000880005000 - 0000000880084fff] pages 80 [ 0.000000] (8 early reservations) ==> bootmem [0880000000 - 0c80000000] [ 0.000000] #0 [0000000000 - 0000001000] BIOS data page [ 0.000000] #1 [0000006000 - 0000008000] TRAMPOLINE [ 0.000000] #2 [0001000000 - 000200f260] TEXT DATA BSS [ 0.000000] #3 [0000098800 - 0000100000] BIOS reserved [ 0.000000] #4 [0002010000 - 000201035c] BRK [ 0.000000] #5 [0000008000 - 000000a000] PGTABLE [ 0.000000] #6 [000000a000 - 0000048000] PGTABLE [ 0.000000] #7 [0000001000 - 000000103c] ACPI SLIT [ 0.000000] Bootmem setup node 2 0000000480000000-0000000880000000 [ 0.000000] NODE_DATA [0000000480000000 - 0000000480004fff] [ 0.000000] bootmap [0000000480005000 - 0000000480084fff] pages 80 [ 0.000000] (8 early reservations) ==> bootmem [0480000000 - 0880000000] [ 0.000000] #0 [0000000000 - 0000001000] BIOS data page [ 0.000000] #1 [0000006000 - 0000008000] TRAMPOLINE [ 0.000000] #2 [0001000000 
- 000200f260] TEXT DATA BSS [ 0.000000] #3 [0000098800 - 0000100000] BIOS reserved [ 0.000000] #4 [0002010000 - 000201035c] BRK [ 0.000000] #5 [0000008000 - 000000a000] PGTABLE [ 0.000000] #6 [000000a000 - 0000048000] PGTABLE [ 0.000000] #7 [0000001000 - 000000103c] ACPI SLIT [ 0.000000] Bootmem setup node 3 0000000c80000000-0000001080000000 [ 0.000000] NODE_DATA [0000000c80000000 - 0000000c80004fff] [ 0.000000] bootmap [0000000c80005000 - 0000000c80084fff] pages 80 [ 0.000000] (8 early reservations) ==> bootmem [0c80000000 - 1080000000] [ 0.000000] #0 [0000000000 - 0000001000] BIOS data page [ 0.000000] #1 [0000006000 - 0000008000] TRAMPOLINE [ 0.000000] #2 [0001000000 - 000200f260] TEXT DATA BSS [ 0.000000] #3 [0000098800 - 0000100000] BIOS reserved [ 0.000000] #4 [0002010000 - 000201035c] BRK [ 0.000000] #5 [0000008000 - 000000a000] PGTABLE [ 0.000000] #6 [000000a000 - 0000048000] PGTABLE [ 0.000000] #7 [0000001000 - 000000103c] ACPI SLIT [ 0.000000] found SMP MP-table at [ffff8800000fddb0] fddb0 [ 0.000000] [ffffea0000000000-ffffea001d3fffff] PMD -> [ffff880028600000-ffff8800425fffff] on node 0 [ 0.000000] [ffffea001d400000-ffffea00373fffff] PMD -> [ffff880480200000-ffff88049a1fffff] on node 2 [ 0.000000] [ffffea0037400000-ffffea003fffffff] PMD -> [ffff880880200000-ffff880888dfffff] on node 1 [ 0.000000] [ffffea0040000000-ffffea00513fffff] PMD -> [ffff880889000000-ffff88089a3fffff] on node 1 [ 0.000000] [ffffea0051400000-ffffea006b3fffff] PMD -> [ffff880c80200000-ffff880c9a1fffff] on node 3 [ 0.000000] Zone PFN ranges: [ 0.000000] DMA 0x00000001 -> 0x00001000 [ 0.000000] DMA32 0x00001000 -> 0x00100000 [ 0.000000] Normal 0x00100000 -> 0x01080000 [ 0.000000] Movable zone start PFN for each node [ 0.000000] early_node_map[6] active PFN ranges [ 0.000000] 0: 0x00000001 -> 0x00000098 [ 0.000000] 0: 0x00000100 -> 0x00078c59 [ 0.000000] 0: 0x00100000 -> 0x00480000 [ 0.000000] 2: 0x00480000 -> 0x00880000 [ 0.000000] 1: 0x00880000 -> 0x00c80000 [ 0.000000] 3: 0x00c80000 
-> 0x01080000 [ 0.000000] On node 0 totalpages: 4164592 [ 0.000000] DMA zone: 104 pages used for memmap [ 0.000000] DMA zone: 185 pages reserved [ 0.000000] DMA zone: 3702 pages, LIFO batch:0 [ 0.000000] DMA32 zone: 26520 pages used for memmap [ 0.000000] DMA32 zone: 464065 pages, LIFO batch:31 [ 0.000000] Normal zone: 93184 pages used for memmap [ 0.000000] Normal zone: 3576832 pages, LIFO batch:31 [ 0.000000] On node 1 totalpages: 4194304 [ 0.000000] Normal zone: 106496 pages used for memmap [ 0.000000] Normal zone: 4087808 pages, LIFO batch:31 [ 0.000000] On node 2 totalpages: 4194304 [ 0.000000] Normal zone: 106496 pages used for memmap [ 0.000000] Normal zone: 4087808 pages, LIFO batch:31 [ 0.000000] On node 3 totalpages: 4194304 [ 0.000000] Normal zone: 106496 pages used for memmap [ 0.000000] Normal zone: 4087808 pages, LIFO batch:31 -- Jens Axboe ^ permalink raw reply [flat|nested] 42+ messages in thread
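As a sanity check on the NUMA output above, the per-node page totals can be recomputed from the SRAT ranges and the early_node_map PFN ranges. This is only an illustrative sketch: the 4 KiB page size is an assumption (the usual x86-64 base page size) and the helper names are made up.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB base pages


def pages_in_range(start, end):
    """Number of pages in the physical address range [start, end)."""
    return (end - start) // PAGE_SIZE


def pages_in_pfn_ranges(ranges):
    """Number of pages covered by a list of [start_pfn, end_pfn) ranges."""
    return sum(end - start for start, end in ranges)


# SRAT memory ranges for nodes 1-3, copied from the log above.
srat = {
    1: (0x880000000, 0xC80000000),
    2: (0x480000000, 0x880000000),
    3: (0xC80000000, 0x1080000000),
}

# early_node_map PFN ranges for node 0, copied from the log above.
node0_pfns = [(0x1, 0x98), (0x100, 0x78C59), (0x100000, 0x480000)]

node_totals = {node: pages_in_range(lo, hi) for node, (lo, hi) in srat.items()}
node_totals[0] = pages_in_pfn_ranges(node0_pfns)
```

Nodes 1 to 3 each span 16 GiB, which gives the 4194304 pages the kernel reports. Node 0 loses the holes below 4G and comes out at 4164592, again matching the "totalpages" line.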
* Re: kexec boot regression 2009-12-15 19:22 ` Jens Axboe @ 2009-12-15 19:28 ` Jens Axboe 0 siblings, 0 replies; 42+ messages in thread From: Jens Axboe @ 2009-12-15 19:28 UTC (permalink / raw) To: Yinghai Lu Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org On Tue, Dec 15 2009, Jens Axboe wrote: > > boot log of first kernel? > > Hmm not completely sure, let me re-do it after a cold boot. This is from a cold boot of 2.6.32. 0000000000000000-0000000000098800 (System RAM) 0000000000098800-00000000000a0000 (reserved) 0000000079301000-0000000079303000 (reserved) 0000000079303000-0000000079305000 (ACPI Tables) 0000000079305000-0000000079310000 (reserved) 0000000079310000-0000000079314000 (ACPI Tables) 0000000079314000-0000000079319000 (reserved) 0000000079319000-0000000079336000 (ACPI Tables) 0000000079336000-0000000079358000 (reserved) 0000000079358000-0000000079388000 (ACPI Tables) 0000000079388000-00000000793c9000 (reserved) 00000000793c9000-000000007968f000 (ACPI Tables) 00000000000e0000-0000000000100000 (reserved) 000000007968f000-00000000796bb000 (reserved) 00000000796bb000-00000000799d8000 (ACPI Tables) 00000000799d8000-0000000079bd8000 (ACPI Non-volatile Storage) 0000000079bd8000-0000000079d87000 (ACPI Tables) 0000000079d87000-0000000079d8a000 (reserved) 0000000079d8a000-0000000079dca000 (ACPI Tables) 0000000079dca000-0000000079dcb000 (reserved) 0000000079dcb000-0000000079e1c000 (ACPI Tables) 0000000079e1c000-0000000079e87000 (reserved) 0000000079e87000-000000007bd5f000 (ACPI Tables) 0000000000100000-0000000078c63000 (System RAM) 000000007bd5f000-000000007be4f000 (reserved) 000000007be4f000-000000007bf87000 (ACPI Tables) 000000007bf87000-000000007bfcf000 (ACPI Non-volatile Storage) 000000007bfcf000-000000007bfff000 (ACPI Tables) 000000007bfff000-0000000090000000 (reserved) 00000000fc000000-00000000fd000000 (reserved) 00000000fed1c000-00000000fed20000 (reserved) 00000000ff000000-0000000100000000 (reserved) 
0000000100000000-0000001080000000 (System RAM) 0000000078c63000-0000000078e77000 (ACPI Non-volatile Storage) 0000000078e77000-000000007924e000 (ACPI Tables) 000000007924e000-00000000792c2000 (reserved) 00000000792c2000-00000000792d2000 (ACPI Tables) 00000000792d2000-00000000792e7000 (reserved) 00000000792e7000-0000000079301000 (ACPI Tables) -- Jens Axboe ^ permalink raw reply [flat|nested] 42+ messages in thread
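The cold-boot map differs from the earlier dump in only a few entries, which is easy to miss by eye. A small sketch for diffing two dumps in the `start-end (type)` format used in this thread follows. The function names are hypothetical, and the two sample strings are short excerpts of the dumps above, not the full maps.

```python
import re

# One dump entry: 16 hex digits, '-', 16 hex digits, then the type in parentheses.
ENTRY = re.compile(r"([0-9a-f]{16})-([0-9a-f]{16}) \(([^)]+)\)")


def parse_dump(text):
    """Parse a memmap dump into a set of (start, end, type) tuples."""
    return {
        (int(m.group(1), 16), int(m.group(2), 16), m.group(3))
        for m in ENTRY.finditer(text)
    }


def diff_maps(a, b):
    """Return (entries only in a, entries only in b), each sorted by address."""
    return sorted(a - b), sorted(b - a)


# Excerpts only: three entries from each dump posted in the thread.
earlier_dump = """
0000000000100000-0000000078c59000 (System RAM)
0000000078c59000-0000000078e6d000 (ACPI Non-volatile Storage)
000000007bfff000-0000000090000000 (reserved)
"""
cold_boot_dump = """
0000000000100000-0000000078c63000 (System RAM)
0000000078c63000-0000000078e77000 (ACPI Non-volatile Storage)
000000007bfff000-0000000090000000 (reserved)
"""

only_earlier, only_cold = diff_maps(parse_dump(earlier_dump), parse_dump(cold_boot_dump))
```

On these excerpts the diff shows the System RAM / ACPI NVS boundary moving from 0x78c59000 to 0x78c63000, while the reserved range ending at 0x90000000 is identical in both.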
* Re: kexec boot regression 2009-12-15 19:11 ` Jens Axboe 2009-12-15 19:17 ` Yinghai Lu @ 2009-12-15 19:44 ` Yinghai Lu 2009-12-15 19:48 ` Jens Axboe 1 sibling, 1 reply; 42+ messages in thread From: Yinghai Lu @ 2009-12-15 19:44 UTC (permalink / raw) To: Jens Axboe Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org Jens Axboe wrote: > On Tue, Dec 15 2009, Yinghai Lu wrote: >> Jens Axboe wrote: >>> On Tue, Dec 15 2009, Yinghai Lu wrote: >>>> [ 13.018720] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000) >>>> >>>> [ 13.100724] [Firmware Bug]: PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] not reserved in ACPI motherboard resources >>> On a "normal" non-kexec boot, I get: >>> >>> [ 12.173583] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000) >>> [ 12.184075] PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] reserved in E820 >>> [ 12.216874] PCI: Using configuration type 1 for base access >>> >> can you run following scripts in first kernel? >> >> cd /sys/firmware/memmap >> for dir in * ; do >> start=$(cat $dir/start) >> end=$(cat $dir/end) >> type=$(cat $dir/type) >> printf "%016x-%016x (%s)\n" $start $[ $end +1] "$type" >> /tmp/memmap.txt >> done >> >> and send out /tmp/memmap.txt > > Below. > >> what is your kexec tools version? could be too old? 
> > It says: > > kexec-tools-testing 20080324 released 24th March 2008 > > > 0000000000000000-0000000000098800 (System RAM) > 0000000000098800-00000000000a0000 (reserved) > 0000000079301000-0000000079303000 (reserved) > 0000000079303000-0000000079305000 (ACPI Tables) > 0000000079305000-0000000079310000 (reserved) > 0000000079310000-0000000079314000 (ACPI Tables) > 0000000079314000-0000000079319000 (reserved) > 0000000079319000-0000000079336000 (ACPI Tables) > 0000000079336000-0000000079358000 (reserved) > 0000000079358000-0000000079388000 (ACPI Tables) > 0000000079388000-00000000793c9000 (reserved) > 00000000793c9000-000000007968f000 (ACPI Tables) > 00000000000e0000-0000000000100000 (reserved) > 000000007968f000-00000000796bb000 (reserved) > 00000000796bb000-00000000799d8000 (ACPI Tables) > 00000000799d8000-0000000079bd8000 (ACPI Non-volatile Storage) > 0000000079bd8000-0000000079d8b000 (ACPI Tables) > 0000000079d8b000-0000000079d8c000 (reserved) > 0000000079d8c000-0000000079dc8000 (ACPI Tables) > 0000000079dc8000-0000000079dcb000 (reserved) > 0000000079dcb000-0000000079e1c000 (ACPI Tables) > 0000000079e1c000-0000000079e87000 (reserved) > 0000000079e87000-000000007bd5f000 (ACPI Tables) > 0000000000100000-0000000078c59000 (System RAM) > 000000007bd5f000-000000007be4f000 (reserved) > 000000007be4f000-000000007bf87000 (ACPI Tables) so following ranges are not passed to second kernel by kexec? 
> 000000007bf87000-000000007bfcf000 (ACPI Non-volatile Storage) > 000000007bfcf000-000000007bfff000 (ACPI Tables) > 000000007bfff000-0000000090000000 (reserved) > 00000000fc000000-00000000fd000000 (reserved) > 00000000fed1c000-00000000fed20000 (reserved) > 00000000ff000000-0000000100000000 (reserved) > 0000000100000000-0000001080000000 (System RAM) > 0000000078c59000-0000000078e6d000 (ACPI Non-volatile Storage) > 0000000078e6d000-000000007924e000 (ACPI Tables) > 000000007924e000-00000000792c2000 (reserved) > 00000000792c2000-00000000792d2000 (ACPI Tables) > 00000000792d2000-00000000792e7000 (reserved) > 00000000792e7000-0000000079301000 (ACPI Tables) > second kernel only get [ 0.000000] BIOS-provided physical RAM map: [ 0.000000] BIOS-e820: 0000000000000100 - 0000000000098800 (usable) [ 0.000000] BIOS-e820: 0000000000098800 - 00000000000a0000 (reserved) [ 0.000000] BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved) [ 0.000000] BIOS-e820: 0000000000100000 - 0000000078c63000 (usable) [ 0.000000] BIOS-e820: 0000000078c63000 - 0000000078e77000 (ACPI NVS) [ 0.000000] BIOS-e820: 0000000078e77000 - 000000007924e000 (ACPI data) [ 0.000000] BIOS-e820: 000000007924e000 - 00000000792c2000 (reserved) [ 0.000000] BIOS-e820: 00000000792c2000 - 00000000792d2000 (ACPI data) [ 0.000000] BIOS-e820: 00000000792d2000 - 00000000792e7000 (reserved) [ 0.000000] BIOS-e820: 00000000792e7000 - 0000000079301000 (ACPI data) [ 0.000000] BIOS-e820: 0000000079301000 - 0000000079303000 (reserved) [ 0.000000] BIOS-e820: 0000000079303000 - 0000000079305000 (ACPI data) [ 0.000000] BIOS-e820: 0000000079305000 - 0000000079310000 (reserved) [ 0.000000] BIOS-e820: 0000000079310000 - 0000000079314000 (ACPI data) [ 0.000000] BIOS-e820: 0000000079314000 - 0000000079319000 (reserved) [ 0.000000] BIOS-e820: 0000000079319000 - 0000000079336000 (ACPI data) [ 0.000000] BIOS-e820: 0000000079336000 - 0000000079358000 (reserved) [ 0.000000] BIOS-e820: 0000000079358000 - 0000000079388000 (ACPI data) [ 
0.000000] BIOS-e820: 0000000079388000 - 00000000793c9000 (reserved)
[ 0.000000] BIOS-e820: 00000000793c9000 - 000000007968f000 (ACPI data)
[ 0.000000] BIOS-e820: 000000007968f000 - 00000000796bb000 (reserved)
[ 0.000000] BIOS-e820: 00000000796bb000 - 00000000799d8000 (ACPI data)
[ 0.000000] BIOS-e820: 00000000799d8000 - 0000000079bd8000 (ACPI NVS)
[ 0.000000] BIOS-e820: 0000000079bd8000 - 0000000079d87000 (ACPI data)
[ 0.000000] BIOS-e820: 0000000079d87000 - 0000000079d8a000 (reserved)
[ 0.000000] BIOS-e820: 0000000079d8a000 - 0000000079dca000 (ACPI data)
[ 0.000000] BIOS-e820: 0000000079dca000 - 0000000079dcb000 (reserved)
[ 0.000000] BIOS-e820: 0000000079dcb000 - 0000000079e1c000 (ACPI data)
[ 0.000000] BIOS-e820: 0000000079e1c000 - 0000000079e87000 (reserved)
[ 0.000000] BIOS-e820: 0000000079e87000 - 000000007bd5f000 (ACPI data)
[ 0.000000] BIOS-e820: 000000007bd5f000 - 000000007be4f000 (reserved)
[ 0.000000] BIOS-e820: 000000007be4f000 - 000000007bf87000 (ACPI data)

so the mmconf range is not reserved, and some ACPI data ranges are missing too.

> 0000000078c59000-0000000078e6d000 (ACPI Non-volatile Storage)

0000000078c59000 - 0000000078c63000 gets corrupted...

YH
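The point about the mmconf range can be illustrated with a simplified coverage check. This is a sketch, not the kernel's actual MMCONFIG validation: it only asks whether the range sits entirely inside a single entry of type "reserved", and the function name is invented. The two entries are the last ones of each map as posted in the thread.

```python
def is_reserved(e820, start, end):
    """True if [start, end) lies entirely inside one 'reserved' entry.

    Simplification: a range spanning several adjacent reserved entries
    would be reported as not reserved by this sketch.
    """
    return any(
        kind == "reserved" and lo <= start and end <= hi
        for lo, hi, kind in e820
    )


# MMCONFIG at [mem 0x80000000-0x8fffffff], written with an exclusive end.
MMCONFIG = (0x80000000, 0x90000000)

# Last relevant entry of the first kernel's map (from /sys/firmware/memmap).
first_kernel = [(0x7BFFF000, 0x90000000, "reserved")]

# Last entry the second kernel received; nothing above 0x7bf87000 was passed.
second_kernel = [(0x7BE4F000, 0x7BF87000, "ACPI data")]

first_ok = is_reserved(first_kernel, *MMCONFIG)
second_ok = is_reserved(second_kernel, *MMCONFIG)
```

With the first kernel's map the range is covered by 0x7bfff000-0x90000000 (reserved). With the truncated map handed to the second kernel nothing covers it, which is consistent with the "not reserved" complaint in the kexec boot log.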
* Re: kexec boot regression 2009-12-15 19:44 ` Yinghai Lu @ 2009-12-15 19:48 ` Jens Axboe 2009-12-15 19:49 ` Yinghai Lu 0 siblings, 1 reply; 42+ messages in thread From: Jens Axboe @ 2009-12-15 19:48 UTC (permalink / raw) To: Yinghai Lu Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org On Tue, Dec 15 2009, Yinghai Lu wrote: > Jens Axboe wrote: > > On Tue, Dec 15 2009, Yinghai Lu wrote: > >> Jens Axboe wrote: > >>> On Tue, Dec 15 2009, Yinghai Lu wrote: > >>>> [ 13.018720] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000) > >>>> > >>>> [ 13.100724] [Firmware Bug]: PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] not reserved in ACPI motherboard resources > >>> On a "normal" non-kexec boot, I get: > >>> > >>> [ 12.173583] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000) > >>> [ 12.184075] PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] reserved in E820 > >>> [ 12.216874] PCI: Using configuration type 1 for base access > >>> > >> can you run following scripts in first kernel? > >> > >> cd /sys/firmware/memmap > >> for dir in * ; do > >> start=$(cat $dir/start) > >> end=$(cat $dir/end) > >> type=$(cat $dir/type) > >> printf "%016x-%016x (%s)\n" $start $[ $end +1] "$type" >> /tmp/memmap.txt > >> done > >> > >> and send out /tmp/memmap.txt > > > > Below. > > > >> what is your kexec tools version? could be too old? 
> > > > It says: > > > > kexec-tools-testing 20080324 released 24th March 2008 > > > > > > 0000000000000000-0000000000098800 (System RAM) > > 0000000000098800-00000000000a0000 (reserved) > > 0000000079301000-0000000079303000 (reserved) > > 0000000079303000-0000000079305000 (ACPI Tables) > > 0000000079305000-0000000079310000 (reserved) > > 0000000079310000-0000000079314000 (ACPI Tables) > > 0000000079314000-0000000079319000 (reserved) > > 0000000079319000-0000000079336000 (ACPI Tables) > > 0000000079336000-0000000079358000 (reserved) > > 0000000079358000-0000000079388000 (ACPI Tables) > > 0000000079388000-00000000793c9000 (reserved) > > 00000000793c9000-000000007968f000 (ACPI Tables) > > 00000000000e0000-0000000000100000 (reserved) > > 000000007968f000-00000000796bb000 (reserved) > > 00000000796bb000-00000000799d8000 (ACPI Tables) > > 00000000799d8000-0000000079bd8000 (ACPI Non-volatile Storage) > > 0000000079bd8000-0000000079d8b000 (ACPI Tables) > > 0000000079d8b000-0000000079d8c000 (reserved) > > 0000000079d8c000-0000000079dc8000 (ACPI Tables) > > 0000000079dc8000-0000000079dcb000 (reserved) > > 0000000079dcb000-0000000079e1c000 (ACPI Tables) > > 0000000079e1c000-0000000079e87000 (reserved) > > 0000000079e87000-000000007bd5f000 (ACPI Tables) > > 0000000000100000-0000000078c59000 (System RAM) > > 000000007bd5f000-000000007be4f000 (reserved) > > 000000007be4f000-000000007bf87000 (ACPI Tables) > > so following ranges are not passed to second kernel by kexec? I have the following addition to my kexec kernel command line: memmap=62G@4G since that last big 62G RAM entry doesn't show up without it, that's why you see a user defined e820 map as well in the boot logs. So a kexec'ed kernel is missing at least that entry. I just tried with the latest and greatest kexec-tools (2.0.1) and there's no difference. -- Jens Axboe ^ permalink raw reply [flat|nested] 42+ messages in thread
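For reference, the arithmetic behind memmap=62G@4G is worked through below as a sketch. It assumes the usual `memmap=size@start` form with K/M/G suffixes as powers of two. The parser is a hypothetical helper, not kernel code.

```python
UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Parse a size such as '62G', '4G' or '0x1000' into bytes."""
    text = text.strip()
    if text and text[-1].upper() in UNITS:
        return int(text[:-1], 0) * UNITS[text[-1].upper()]
    return int(text, 0)


def parse_memmap_ram(option):
    """Turn 'memmap=SIZE@START' into an address range [start, end)."""
    value = option.split("=", 1)[1]
    size_text, start_text = value.split("@", 1)
    start = parse_size(start_text)
    return start, start + parse_size(size_text)


start, end = parse_memmap_ram("memmap=62G@4G")
```

The result is 0x100000000 to 0x1080000000, which is exactly the last System RAM entry in the memmap dumps, so the option restores the one large RAM range the kexec'ed kernel would otherwise not see.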
* Re: kexec boot regression 2009-12-15 19:48 ` Jens Axboe @ 2009-12-15 19:49 ` Yinghai Lu 2009-12-15 19:57 ` Jens Axboe 0 siblings, 1 reply; 42+ messages in thread From: Yinghai Lu @ 2009-12-15 19:49 UTC (permalink / raw) To: Jens Axboe Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org Jens Axboe wrote: > On Tue, Dec 15 2009, Yinghai Lu wrote: >> Jens Axboe wrote: >>> On Tue, Dec 15 2009, Yinghai Lu wrote: >>>> Jens Axboe wrote: >>>>> On Tue, Dec 15 2009, Yinghai Lu wrote: >>>>>> [ 13.018720] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000) >>>>>> >>>>>> [ 13.100724] [Firmware Bug]: PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] not reserved in ACPI motherboard resources >>>>> On a "normal" non-kexec boot, I get: >>>>> >>>>> [ 12.173583] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000) >>>>> [ 12.184075] PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] reserved in E820 >>>>> [ 12.216874] PCI: Using configuration type 1 for base access >>>>> >>>> can you run following scripts in first kernel? >>>> >>>> cd /sys/firmware/memmap >>>> for dir in * ; do >>>> start=$(cat $dir/start) >>>> end=$(cat $dir/end) >>>> type=$(cat $dir/type) >>>> printf "%016x-%016x (%s)\n" $start $[ $end +1] "$type" >> /tmp/memmap.txt >>>> done >>>> >>>> and send out /tmp/memmap.txt >>> Below. >>> >>>> what is your kexec tools version? could be too old? 
>>> It says: >>> >>> kexec-tools-testing 20080324 released 24th March 2008 >>> >>> >>> 0000000000000000-0000000000098800 (System RAM) >>> 0000000000098800-00000000000a0000 (reserved) >>> 0000000079301000-0000000079303000 (reserved) >>> 0000000079303000-0000000079305000 (ACPI Tables) >>> 0000000079305000-0000000079310000 (reserved) >>> 0000000079310000-0000000079314000 (ACPI Tables) >>> 0000000079314000-0000000079319000 (reserved) >>> 0000000079319000-0000000079336000 (ACPI Tables) >>> 0000000079336000-0000000079358000 (reserved) >>> 0000000079358000-0000000079388000 (ACPI Tables) >>> 0000000079388000-00000000793c9000 (reserved) >>> 00000000793c9000-000000007968f000 (ACPI Tables) >>> 00000000000e0000-0000000000100000 (reserved) >>> 000000007968f000-00000000796bb000 (reserved) >>> 00000000796bb000-00000000799d8000 (ACPI Tables) >>> 00000000799d8000-0000000079bd8000 (ACPI Non-volatile Storage) >>> 0000000079bd8000-0000000079d8b000 (ACPI Tables) >>> 0000000079d8b000-0000000079d8c000 (reserved) >>> 0000000079d8c000-0000000079dc8000 (ACPI Tables) >>> 0000000079dc8000-0000000079dcb000 (reserved) >>> 0000000079dcb000-0000000079e1c000 (ACPI Tables) >>> 0000000079e1c000-0000000079e87000 (reserved) >>> 0000000079e87000-000000007bd5f000 (ACPI Tables) >>> 0000000000100000-0000000078c59000 (System RAM) >>> 000000007bd5f000-000000007be4f000 (reserved) >>> 000000007be4f000-000000007bf87000 (ACPI Tables) >> so following ranges are not passed to second kernel by kexec? > > I have the following addition to my kexec kernel command line: > > memmap=62G@4G > > since that last big 62G RAM entry doesn't show up without it, that's why > you see a user defined e820 map as well in the boot logs. So a kexec'ed > kernel is missing at least that entry. > > I just tried with the latest and greatest kexec-tools (2.0.1) and > there's no difference. current kernel kexec 2.6.32 make numa and mmconf working on second kernel? YH ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: kexec boot regression 2009-12-15 19:49 ` Yinghai Lu @ 2009-12-15 19:57 ` Jens Axboe 0 siblings, 0 replies; 42+ messages in thread From: Jens Axboe @ 2009-12-15 19:57 UTC (permalink / raw) To: Yinghai Lu Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org On Tue, Dec 15 2009, Yinghai Lu wrote: > Jens Axboe wrote: > > On Tue, Dec 15 2009, Yinghai Lu wrote: > >> Jens Axboe wrote: > >>> On Tue, Dec 15 2009, Yinghai Lu wrote: > >>>> Jens Axboe wrote: > >>>>> On Tue, Dec 15 2009, Yinghai Lu wrote: > >>>>>> [ 13.018720] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000) > >>>>>> > >>>>>> [ 13.100724] [Firmware Bug]: PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] not reserved in ACPI motherboard resources > >>>>> On a "normal" non-kexec boot, I get: > >>>>> > >>>>> [ 12.173583] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000) > >>>>> [ 12.184075] PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] reserved in E820 > >>>>> [ 12.216874] PCI: Using configuration type 1 for base access > >>>>> > >>>> can you run following scripts in first kernel? > >>>> > >>>> cd /sys/firmware/memmap > >>>> for dir in * ; do > >>>> start=$(cat $dir/start) > >>>> end=$(cat $dir/end) > >>>> type=$(cat $dir/type) > >>>> printf "%016x-%016x (%s)\n" $start $[ $end +1] "$type" >> /tmp/memmap.txt > >>>> done > >>>> > >>>> and send out /tmp/memmap.txt > >>> Below. > >>> > >>>> what is your kexec tools version? could be too old? 
> >>> It says: > >>> > >>> kexec-tools-testing 20080324 released 24th March 2008 > >>> > >>> > >>> 0000000000000000-0000000000098800 (System RAM) > >>> 0000000000098800-00000000000a0000 (reserved) > >>> 0000000079301000-0000000079303000 (reserved) > >>> 0000000079303000-0000000079305000 (ACPI Tables) > >>> 0000000079305000-0000000079310000 (reserved) > >>> 0000000079310000-0000000079314000 (ACPI Tables) > >>> 0000000079314000-0000000079319000 (reserved) > >>> 0000000079319000-0000000079336000 (ACPI Tables) > >>> 0000000079336000-0000000079358000 (reserved) > >>> 0000000079358000-0000000079388000 (ACPI Tables) > >>> 0000000079388000-00000000793c9000 (reserved) > >>> 00000000793c9000-000000007968f000 (ACPI Tables) > >>> 00000000000e0000-0000000000100000 (reserved) > >>> 000000007968f000-00000000796bb000 (reserved) > >>> 00000000796bb000-00000000799d8000 (ACPI Tables) > >>> 00000000799d8000-0000000079bd8000 (ACPI Non-volatile Storage) > >>> 0000000079bd8000-0000000079d8b000 (ACPI Tables) > >>> 0000000079d8b000-0000000079d8c000 (reserved) > >>> 0000000079d8c000-0000000079dc8000 (ACPI Tables) > >>> 0000000079dc8000-0000000079dcb000 (reserved) > >>> 0000000079dcb000-0000000079e1c000 (ACPI Tables) > >>> 0000000079e1c000-0000000079e87000 (reserved) > >>> 0000000079e87000-000000007bd5f000 (ACPI Tables) > >>> 0000000000100000-0000000078c59000 (System RAM) > >>> 000000007bd5f000-000000007be4f000 (reserved) > >>> 000000007be4f000-000000007bf87000 (ACPI Tables) > >> so following ranges are not passed to second kernel by kexec? > > > > I have the following addition to my kexec kernel command line: > > > > memmap=62G@4G > > > > since that last big 62G RAM entry doesn't show up without it, that's why > > you see a user defined e820 map as well in the boot logs. So a kexec'ed > > kernel is missing at least that entry. > > > > I just tried with the latest and greatest kexec-tools (2.0.1) and > > there's no difference. 
> > current kernel kexec 2.6.32 make numa and mmconf working on second kernel? Just tested that configuration, and with current -git booted and kexec into 2.6.32 gets me working numa but mmconf still complains: [ 15.669222] PCI: MCFG configuration 0: base 80000000 segment 0 buses 0 - 255 [ 15.677166] PCI: Not using MMCONFIG. [...] [ 15.971448] PCI: MCFG configuration 0: base 80000000 segment 0 buses 0 - 255 [ 16.066995] PCI: BIOS Bug: MCFG area at 80000000 is not reserved in ACPI motherboard resources [ 16.076705] PCI: Not using MMCONFIG. SRAT looks good: [...] [ 0.000000] SRAT: Node 0 PXM 0 0-80000000 [ 0.000000] SRAT: Node 0 PXM 0 100000000-480000000 [ 0.000000] SRAT: Node 2 PXM 1 480000000-880000000 [ 0.000000] SRAT: Node 1 PXM 2 880000000-c80000000 [ 0.000000] SRAT: Node 3 PXM 3 c80000000-1080000000 [ 0.000000] NUMA: Using 31 for the hash shift. [snip same working NUMA config] -- Jens Axboe ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: kexec boot regression 2009-12-15 19:04 ` Yinghai Lu 2009-12-15 19:11 ` Jens Axboe @ 2009-12-15 21:30 ` Markus Trippelsdorf 2009-12-15 23:02 ` kexec boot regression radeon/kms (bisected) Markus Trippelsdorf 1 sibling, 1 reply; 42+ messages in thread From: Markus Trippelsdorf @ 2009-12-15 21:30 UTC (permalink / raw) To: Yinghai Lu Cc: Jens Axboe, Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org [-- Attachment #1: Type: text/plain, Size: 1318 bytes --] On Tue, Dec 15, 2009 at 11:04:55AM -0800, Yinghai Lu wrote: > Jens Axboe wrote: > > On Tue, Dec 15 2009, Yinghai Lu wrote: > >> [ 13.018720] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000) > >> > >> [ 13.100724] [Firmware Bug]: PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] not reserved in ACPI motherboard resources > > > > On a "normal" non-kexec boot, I get: > > > > [ 12.173583] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000) > > [ 12.184075] PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] reserved in E820 > > [ 12.216874] PCI: Using configuration type 1 for base access > > > > can you run following scripts in first kernel? > > cd /sys/firmware/memmap > for dir in * ; do > start=$(cat $dir/start) > end=$(cat $dir/end) > type=$(cat $dir/type) > printf "%016x-%016x (%s)\n" $start $[ $end +1] "$type" >> /tmp/memmap.txt > done > > and send out /tmp/memmap.txt > > what is your kexec tools version? could be too old? I have the same symptoms on my machine, but the underlying cause must be different. I once reverted all Radeon related changes since 2.6.32 and kexec started working again. Full dmesg and the output of the script is attached. 
kexec-tools 2.0.1 released 13th August 2009
--
Markus
[-- Attachment #2: memmap.txt --]
[-- Type: text/plain, Size: 431 bytes --]
0000000000000000-000000000009fc00 (System RAM)
000000000009fc00-00000000000a0000 (reserved)
00000000000e6000-0000000000100000 (reserved)
0000000000100000-00000000cbf90000 (System RAM)
00000000cbf90000-00000000cbfa8000 (ACPI Tables)
00000000cbfa8000-00000000cbfd0000 (ACPI Non-volatile Storage)
00000000cbfd0000-00000000cc000000 (reserved)
00000000fff00000-0000000100000000 (reserved)
0000000100000000-0000000130000000 (System RAM)
[-- Attachment #3: dmesg --]
[-- Type: text/plain, Size: 26916 bytes --]
Linux version 2.6.32-07500-g8bea867-dirty (markus@arch.tripp.de) (gcc version 4.4.2 (GCC) ) #5 SMP Tue Dec 15 21:55:00 CET 2009
Command line: BOOT_IMAGE=/boot/kernel root=/dev/sdb fbcon=rotate:3 quiet
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e6000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 00000000cbf90000 (usable)
BIOS-e820: 00000000cbf90000 - 00000000cbfa8000 (ACPI data)
BIOS-e820: 00000000cbfa8000 - 00000000cbfd0000 (ACPI NVS)
BIOS-e820: 00000000cbfd0000 - 00000000cc000000 (reserved)
BIOS-e820: 00000000fff00000 - 0000000100000000 (reserved)
BIOS-e820: 0000000100000000 - 0000000130000000 (usable)
NX (Execute Disable) protection: active
DMI present.
last_pfn = 0x130000 max_arch_pfn = 0x400000000 MTRR default type: uncachable MTRR fixed ranges enabled: 00000-9FFFF write-back A0000-EFFFF uncachable F0000-FFFFF write-protect MTRR variable ranges enabled: 0 base 000000000000 mask FFFF80000000 write-back 1 base 000080000000 mask FFFFC0000000 write-back 2 base 0000C0000000 mask FFFFF8000000 write-back 3 base 0000C8000000 mask FFFFFC000000 write-back 4 disabled 5 disabled 6 disabled 7 disabled TOM2: 0000000130000000 aka 4864M x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106 e820 update range: 00000000cc000000 - 0000000100000000 (usable) ==> (reserved) last_pfn = 0xcbf90 max_arch_pfn = 0x400000000 initial memory mapped : 0 - 20000000 Using GB pages for direct mapping init_memory_mapping: 0000000000000000-00000000cbf90000 0000000000 - 00c0000000 page 1G 00c0000000 - 00cbe00000 page 2M 00cbe00000 - 00cbf90000 page 4k kernel direct mapping tables up to cbf90000 @ 8000-b000 init_memory_mapping: 0000000100000000-0000000130000000 0100000000 - 0130000000 page 2M kernel direct mapping tables up to 130000000 @ a000-c000 ACPI: RSDP 00000000000fb880 00024 (v02 ACPIAM) ACPI: XSDT 00000000cbf90100 00054 (v01 102809 XSDT1549 20091028 MSFT 00000097) ACPI: FACP 00000000cbf90290 000F4 (v03 102809 FACP1549 20091028 MSFT 00000097) ACPI Warning: Optional field Pm2ControlBlock has zero address or length: 0000000000000000/1 (20091112/tbfadt-557) ACPI: DSDT 00000000cbf90440 0E774 (v01 A1152 A1152000 00000000 INTL 20060113) ACPI: FACS 00000000cbfa8000 00040 ACPI: APIC 00000000cbf90390 0006C (v01 102809 APIC1549 20091028 MSFT 00000097) ACPI: MCFG 00000000cbf90400 0003C (v01 102809 OEMMCFG 20091028 MSFT 00000097) ACPI: OEMB 00000000cbfa8040 00072 (v01 102809 OEMB1549 20091028 MSFT 00000097) ACPI: HPET 00000000cbf9f440 00038 (v01 102809 OEMHPET 20091028 MSFT 00000097) ACPI: SSDT 00000000cbf9f480 0088C (v01 A M I POWERNOW 00000001 AMD 00000001) ACPI: Local APIC address 0xfee00000 (7 early reservations) ==> bootmem [0000000000 - 
0130000000] #0 [0000000000 - 0000001000] BIOS data page ==> [0000000000 - 0000001000] #1 [0001000000 - 000176e80c] TEXT DATA BSS ==> [0001000000 - 000176e80c] #2 [000009fc00 - 0000100000] BIOS reserved ==> [000009fc00 - 0000100000] #3 [000176f000 - 000176f290] BRK ==> [000176f000 - 000176f290] #4 [0000001000 - 0000003000] TRAMPOLINE ==> [0000001000 - 0000003000] #5 [0000008000 - 000000a000] PGTABLE ==> [0000008000 - 000000a000] #6 [000000a000 - 000000b000] PGTABLE ==> [000000a000 - 000000b000] [ffffea0000000000-ffffea00043fffff] PMD -> [ffff880028600000-ffff88002bffffff] on node 0 Zone PFN ranges: DMA 0x00000000 -> 0x00001000 DMA32 0x00001000 -> 0x00100000 Normal 0x00100000 -> 0x00130000 Movable zone start PFN for each node early_node_map[3] active PFN ranges 0: 0x00000000 -> 0x0000009f 0: 0x00000100 -> 0x000cbf90 0: 0x00100000 -> 0x00130000 On node 0 totalpages: 1031983 DMA zone: 56 pages used for memmap DMA zone: 102 pages reserved DMA zone: 3841 pages, LIFO batch:0 DMA32 zone: 14280 pages used for memmap DMA32 zone: 817096 pages, LIFO batch:31 Normal zone: 2688 pages used for memmap Normal zone: 193920 pages, LIFO batch:31 ACPI: PM-Timer IO Port: 0x808 ACPI: Local APIC address 0xfee00000 ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled) ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled) ACPI: LAPIC (acpi_id[0x03] lapic_id[0x02] enabled) ACPI: LAPIC (acpi_id[0x04] lapic_id[0x03] enabled) ACPI: IOAPIC (id[0x04] address[0xfec00000] gsi_base[0]) IOAPIC[0]: apic_id 4, version 33, address 0xfec00000, GSI 0-23 ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level) ACPI: IRQ0 used by override. ACPI: IRQ2 used by override. ACPI: IRQ9 used by override. 
Using ACPI (MADT) for SMP configuration information ACPI: HPET id: 0x8300 base: 0xfed00000 SMP: Allowing 4 CPUs, 0 hotplug CPUs nr_irqs_gsi: 24 Allocating PCI resources starting at cc000000 (gap: cc000000:33f00000) setup_percpu: NR_CPUS:4 nr_cpumask_bits:4 nr_cpu_ids:4 nr_node_ids:1 PERCPU: Embedded 25 pages/cpu @ffff880028200000 s81432 r0 d20968 u524288 pcpu-alloc: s81432 r0 d20968 u524288 alloc=1*2097152 pcpu-alloc: [0] 0 1 2 3 Built 1 zonelists in Zone order, mobility grouping on. Total pages: 1014857 Kernel command line: BOOT_IMAGE=/boot/kernel root=/dev/sdb fbcon=rotate:3 quiet PID hash table entries: 4096 (order: 3, 32768 bytes) Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes) Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes) Memory: 3987284k/4980736k available (3765k kernel code, 852804k absent, 139720k reserved, 2841k data, 416k init) SLUB: Genslabs=13, HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1 Hierarchical RCU implementation. NR_IRQS:384 Extended CMOS year: 2000 spurious 8259A interrupt: IRQ7. Console: colour VGA+ 80x25 console [tty0] enabled hpet clockevent registered Fast TSC calibration using PIT Detected 3210.336 MHz processor. Calibrating delay loop (skipped), value calculated using timer frequency.. 6420.66 BogoMIPS (lpj=3210332) Mount-cache hash table entries: 256 tseg: 0000000000 CPU: Physical Processor ID: 0 CPU: Processor Core ID: 0 mce: CPU supports 6 MCE banks using C1E aware idle routine Performance Events: AMD PMU driver. ... version: 0 ... bit width: 48 ... generic registers: 4 ... value mask: 0000ffffffffffff ... max period: 00007fffffffffff ... fixed-purpose events: 0 ... 
event mask: 000000000000000f Freeing SMP alternatives: 28k freed ACPI: Core revision 20091112 Setting APIC routing to flat ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1 CPU0: AMD Phenom(tm) II X4 955 Processor stepping 02 Booting Node 0, Processors #1 System has AMD C1E enabled Switch to broadcast mode on CPU1 #2 Switch to broadcast mode on CPU2 #3 Ok. Brought up 4 CPUs Total of 4 processors activated (25686.38 BogoMIPS). Switch to broadcast mode on CPU3 Switch to broadcast mode on CPU0 NET: Registered protocol family 16 node 0 link 0: io port [1000, ffffff] TOM: 00000000d0000000 aka 3328M Fam 10h mmconf [e0000000, efffffff] node 0 link 0: mmio [a0000, bffff] node 0 link 0: mmio [d0000000, efffffff] ==> [d0000000, dfffffff] node 0 link 0: mmio [f0000000, fbcfffff] node 0 link 0: mmio [fbd00000, fbefffff] node 0 link 0: mmio [fbf00000, ffefffff] TOM2: 0000000130000000 aka 4864M bus: [00, 07] on node 0 link 0 bus: 00 index 0 io port: [0, ffff] bus: 00 index 1 mmio: [a0000, bffff] bus: 00 index 2 mmio: [d0000000, dfffffff] bus: 00 index 3 mmio: [f0000000, ffffffff] bus: 00 index 4 mmio: [130000000, fcffffffff] ACPI: bus type pci registered PCI: Using configuration type 1 for base access PCI: Using configuration type 1 for extended access mtrr: your CPUs had inconsistent fixed MTRR settings mtrr: probably your BIOS does not setup all CPUs. mtrr: corrected configuration. 
bio: create slab <bio-0> at 0 ACPI: EC: Look up EC in DSDT ACPI: Executed 3 blocks of module-level executable AML code ACPI: Interpreter enabled ACPI: (supports S0 S5) ACPI: Using IOAPIC for interrupt routing ACPI Warning: Incorrect checksum in table [OEMB] - B2, should be AA (20091112/tbutils-314) ACPI: PCI Root Bridge [PCI0] (0000:00) pci_root PNP0A03:00: ignoring host bridge windows from ACPI; boot with "pci=use_crs" to use them pci_root PNP0A03:00: host bridge window [io 0x0000-0x0cf7] (ignored) pci_root PNP0A03:00: host bridge window [io 0x0d00-0xffff] (ignored) pci_root PNP0A03:00: host bridge window [mem 0x000a0000-0x000bffff] (ignored) pci_root PNP0A03:00: host bridge window [mem 0x000d0000-0x000dffff] (ignored) pci_root PNP0A03:00: host bridge window [mem 0xcc000000-0xdfffffff] (ignored) pci_root PNP0A03:00: host bridge window [mem 0xf0000000-0xfebfffff] (ignored) pci 0000:00:11.0: reg 10: [io 0xc000-0xc007] pci 0000:00:11.0: reg 14: [io 0xb000-0xb003] pci 0000:00:11.0: reg 18: [io 0xa000-0xa007] pci 0000:00:11.0: reg 1c: [io 0x9000-0x9003] pci 0000:00:11.0: reg 20: [io 0x8000-0x800f] pci 0000:00:11.0: reg 24: [mem 0xfbcffc00-0xfbcfffff] pci 0000:00:11.0: set SATA to AHCI mode pci 0000:00:12.0: reg 10: [mem 0xfbcfd000-0xfbcfdfff] pci 0000:00:12.1: reg 10: [mem 0xfbcfe000-0xfbcfefff] pci 0000:00:12.2: reg 10: [mem 0xfbcff800-0xfbcff8ff] pci 0000:00:12.2: supports D1 D2 pci 0000:00:12.2: PME# supported from D0 D1 D2 D3hot pci 0000:00:12.2: PME# disabled pci 0000:00:13.0: reg 10: [mem 0xfbcfb000-0xfbcfbfff] pci 0000:00:13.1: reg 10: [mem 0xfbcfc000-0xfbcfcfff] pci 0000:00:13.2: reg 10: [mem 0xfbcff400-0xfbcff4ff] pci 0000:00:13.2: supports D1 D2 pci 0000:00:13.2: PME# supported from D0 D1 D2 D3hot pci 0000:00:13.2: PME# disabled pci 0000:00:14.1: reg 10: [io 0x0000-0x0007] pci 0000:00:14.1: reg 14: [io 0x0000-0x0003] pci 0000:00:14.1: reg 18: [io 0x0000-0x0007] pci 0000:00:14.1: reg 1c: [io 0x0000-0x0003] pci 0000:00:14.1: reg 20: [io 0xff00-0xff0f] pci 
0000:00:14.5: reg 10: [mem 0xfbcfa000-0xfbcfafff] pci 0000:01:05.0: reg 10: [mem 0xd0000000-0xdfffffff pref] pci 0000:01:05.0: reg 14: [io 0xd000-0xd0ff] pci 0000:01:05.0: reg 18: [mem 0xfbee0000-0xfbeeffff] pci 0000:01:05.0: reg 24: [mem 0xfbd00000-0xfbdfffff] pci 0000:01:05.0: supports D1 D2 pci 0000:01:05.1: reg 10: [mem 0xfbefc000-0xfbefffff] pci 0000:01:05.1: supports D1 D2 pci 0000:00:01.0: PCI bridge to [bus 01-01] pci 0000:00:01.0: bridge window [io 0xd000-0xdfff] pci 0000:00:01.0: bridge window [mem 0xfbd00000-0xfbefffff] pci 0000:00:01.0: bridge window [mem 0xd0000000-0xdfffffff 64bit pref] pci 0000:02:05.0: reg 10: [io 0xe800-0xe8ff] pci 0000:02:05.0: reg 14: [mem 0xfbfffc00-0xfbfffcff] pci 0000:02:05.0: reg 30: [mem 0xfbfc0000-0xfbfdffff pref] pci 0000:02:05.0: supports D1 D2 pci 0000:02:05.0: PME# supported from D1 D2 D3hot D3cold pci 0000:02:05.0: PME# disabled pci 0000:00:14.4: PCI bridge to [bus 02-02] (subtractive decode) pci 0000:00:14.4: bridge window [io 0xe000-0xefff] pci 0000:00:14.4: bridge window [mem 0xfbf00000-0xfbffffff] pci_bus 0000:00: on NUMA node 0 ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P1._PRT] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0PC._PRT] ACPI: PCI Interrupt Link [LNKA] (IRQs *4 7 10 11 12 14 15) ACPI: PCI Interrupt Link [LNKB] (IRQs 4 *7 10 11 12 14 15) ACPI: PCI Interrupt Link [LNKC] (IRQs 4 7 *10 11 12 14 15) ACPI: PCI Interrupt Link [LNKD] (IRQs 4 7 10 *11 12 14 15) ACPI: PCI Interrupt Link [LNKE] (IRQs 4 7 10 *11 12 14 15) ACPI: PCI Interrupt Link [LNKF] (IRQs 4 7 10 11 12 14 15) *0, disabled. ACPI: PCI Interrupt Link [LNKG] (IRQs 4 7 *10 11 12 14 15) ACPI: PCI Interrupt Link [LNKH] (IRQs 4 7 10 11 12 14 15) *0, disabled. vgaarb: device added: PCI:0000:01:05.0,decodes=io+mem,owns=io+mem,locks=none vgaarb: loaded SCSI subsystem initialized libata version 3.00 loaded. 
usbcore: registered new interface driver usbfs usbcore: registered new interface driver hub usbcore: registered new device driver usb PCI: Using ACPI for IRQ routing PCI: pci_cache_line_size set to 64 bytes HPET: 4 timers in total, 1 timers will be used for per-cpu timer hpet0: at MMIO 0xfed00000, IRQs 2, 8, 24, 0 hpet0: 4 comparators, 32-bit 14.318180 MHz counter hpet: hpet2 irq 24 for MSI Switching to clocksource tsc pnp: PnP ACPI init ACPI: bus type pnp registered pnp: PnP ACPI: found 13 devices ACPI: ACPI bus type pnp unregistered system 00:01: [mem 0xcc000000-0xcfffffff] has been reserved system 00:07: [mem 0xfec00000-0xfec00fff] could not be reserved system 00:07: [mem 0xfee00000-0xfee00fff] has been reserved system 00:08: [io 0x04d0-0x04d1] has been reserved system 00:08: [io 0x040b] has been reserved system 00:08: [io 0x04d6] has been reserved system 00:08: [io 0x0c00-0x0c01] has been reserved system 00:08: [io 0x0c14] has been reserved system 00:08: [io 0x0c50-0x0c51] has been reserved system 00:08: [io 0x0c52] has been reserved system 00:08: [io 0x0c6c] has been reserved system 00:08: [io 0x0c6f] has been reserved system 00:08: [io 0x0cd0-0x0cd1] has been reserved system 00:08: [io 0x0cd2-0x0cd3] has been reserved system 00:08: [io 0x0cd4-0x0cd5] has been reserved system 00:08: [io 0x0cd6-0x0cd7] has been reserved system 00:08: [io 0x0cd8-0x0cdf] has been reserved system 00:08: [io 0x0b00-0x0b3f] has been reserved system 00:08: [io 0x0800-0x089f] has been reserved system 00:08: [io 0x0b00-0x0b0f] has been reserved system 00:08: [io 0x0b20-0x0b3f] has been reserved system 00:08: [io 0x0900-0x090f] has been reserved system 00:08: [io 0x0910-0x091f] has been reserved system 00:08: [io 0xfe00-0xfefe] has been reserved system 00:08: [mem 0xffb80000-0xffbfffff] has been reserved system 00:08: [mem 0xfec10000-0xfec1001f] has been reserved system 00:0a: [io 0x0230-0x023f] has been reserved system 00:0a: [io 0x0290-0x029f] has been reserved system 00:0a: [io 
0x0f40-0x0f4f] has been reserved system 00:0a: [io 0x0a30-0x0a3f] has been reserved system 00:0b: [mem 0xe0000000-0xefffffff] has been reserved system 00:0c: [mem 0x00000000-0x0009ffff] could not be reserved system 00:0c: [mem 0x000c0000-0x000cffff] has been reserved system 00:0c: [mem 0x000e0000-0x000fffff] could not be reserved system 00:0c: [mem 0x00100000-0xcbffffff] could not be reserved system 00:0c: [mem 0xfec00000-0xffffffff] could not be reserved pci 0000:00:01.0: PCI bridge to [bus 01-01] pci 0000:00:01.0: bridge window [io 0xd000-0xdfff] pci 0000:00:01.0: bridge window [mem 0xfbd00000-0xfbefffff] pci 0000:00:01.0: bridge window [mem 0xd0000000-0xdfffffff 64bit pref] pci 0000:00:14.4: PCI bridge to [bus 02-02] pci 0000:00:14.4: bridge window [io 0xe000-0xefff] pci 0000:00:14.4: bridge window [mem 0xfbf00000-0xfbffffff] pci 0000:00:14.4: bridge window [mem pref disabled] pci_bus 0000:00: resource 0 [io 0x0000-0xffff] pci_bus 0000:00: resource 1 [mem 0x00000000-0xffffffffffffffff] pci_bus 0000:01: resource 0 [io 0xd000-0xdfff] pci_bus 0000:01: resource 1 [mem 0xfbd00000-0xfbefffff] pci_bus 0000:01: resource 2 [mem 0xd0000000-0xdfffffff 64bit pref] pci_bus 0000:02: resource 0 [io 0xe000-0xefff] pci_bus 0000:02: resource 1 [mem 0xfbf00000-0xfbffffff] pci_bus 0000:02: resource 3 [io 0x0000-0xffff] pci_bus 0000:02: resource 4 [mem 0x00000000-0xffffffffffffffff] NET: Registered protocol family 2 IP route cache hash table entries: 131072 (order: 8, 1048576 bytes) TCP established hash table entries: 262144 (order: 10, 4194304 bytes) TCP bind hash table entries: 65536 (order: 8, 1048576 bytes) TCP: Hash tables configured (established 262144 bind 65536) TCP reno registered UDP hash table entries: 2048 (order: 4, 65536 bytes) UDP-Lite hash table entries: 2048 (order: 4, 65536 bytes) NET: Registered protocol family 1 pci 0000:01:05.0: Boot video device PCI: CLS 64 bytes, default 64 PCI-DMA: Using software bounce buffering for IO (SWIOTLB) Placing 64MB software IO TLB 
between ffff88002c600000 - ffff880030600000 software IO TLB at phys 0x2c600000 - 0x30600000 kvm: Nested Virtualization enabled kvm: Nested Paging enabled msgmni has been set to 7789 Block layer SCSI generic (bsg) driver version 0.4 loaded (major 254) io scheduler noop registered io scheduler cfq registered (default) input: Power Button as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0C:00/input/input0 ACPI: Power Button [PWRB] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input1 ACPI: Power Button [PWRF] processor LNXCPU:00: registered as cooling_device0 processor LNXCPU:01: registered as cooling_device1 processor LNXCPU:02: registered as cooling_device2 processor LNXCPU:03: registered as cooling_device3 Real Time Clock Driver v1.12b Linux agpgart interface v0.103 [drm] Initialized drm 1.1.0 20060810 [drm] radeon defaulting to kernel modesetting. [drm] radeon kernel modesetting enabled. radeon 0000:01:05.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18 radeon 0000:01:05.0: setting latency timer to 64 [drm] radeon: Initializing kernel modesetting. [drm] register mmio base: 0xFBEE0000 [drm] register mmio size: 65536 ATOM BIOS: 113 [drm] Clocks initialized ! [drm] Detected VRAM RAM=192M, BAR=256M [drm] RAM width 32bits DDR [TTM] Zone kernel: Available graphics memory: 1994122 kiB. [drm] radeon: 192M of VRAM memory ready [drm] radeon: 512M of GTT memory ready. [drm] radeon: irq initialized. [drm] GART: num cpu pages 131072, num gpu pages 131072 [drm] Loading RS780 Microcode platform radeon_cp.0: firmware: using built-in firmware radeon/RS780_pfp.bin platform radeon_cp.0: firmware: using built-in firmware radeon/RS780_me.bin platform radeon_cp.0: firmware: using built-in firmware radeon/R600_rlc.bin [drm] ring test succeeded in 1 usecs [drm] radeon: ib pool ready. 
[drm] ib test succeeded in 0 usecs [drm] Radeon Display Connectors [drm] Connector 0: [drm] VGA [drm] DDC: 0x7e40 0x7e40 0x7e44 0x7e44 0x7e48 0x7e48 0x7e4c 0x7e4c [drm] Encoders: [drm] CRT1: INTERNAL_KLDSCP_DAC1 [drm] Connector 1: [drm] DVI-D [drm] HPD3 [drm] DDC: 0x7e50 0x7e50 0x7e54 0x7e54 0x7e58 0x7e58 0x7e5c 0x7e5c [drm] Encoders: [drm] DFP3: INTERNAL_KLDSCP_LVTMA [drm] fb mappable at 0xD0141000 [drm] vram apper at 0xD0000000 [drm] size 7257600 [drm] fb depth is 24 [drm] pitch is 6912 executing set pll executing set crtc timing [drm] TMDS-11: set mode 1680x1050 1d Console: switching to colour frame buffer device 131x105 fb0: radeondrmfb frame buffer device registered panic notifier [drm] Initialized radeon 2.0.0 20080528 for 0000:01:05.0 on minor 0 loop: module loaded ahci 0000:00:11.0: version 3.0 ahci 0000:00:11.0: PCI INT A -> GSI 22 (level, low) -> IRQ 22 ahci 0000:00:11.0: AHCI 0001.0100 32 slots 4 ports 3 Gbps 0xf impl SATA mode ahci 0000:00:11.0: flags: 64bit ncq sntf ilck pm led clo pmp pio slum part ccc scsi0 : ahci scsi1 : ahci scsi2 : ahci scsi3 : ahci ata1: SATA max UDMA/133 irq_stat 0x00400000, PHY RDY changed ata2: SATA max UDMA/133 abar m1024@0xfbcffc00 port 0xfbcffd80 irq 22 ata3: SATA max UDMA/133 abar m1024@0xfbcffc00 port 0xfbcffe00 irq 22 ata4: SATA max UDMA/133 abar m1024@0xfbcffc00 port 0xfbcffe80 irq 22 pata_atiixp 0000:00:14.1: PCI INT A -> GSI 16 (level, low) -> IRQ 16 pata_atiixp 0000:00:14.1: setting latency timer to 64 scsi4 : pata_atiixp scsi5 : pata_atiixp ata5: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0xff00 irq 14 ata6: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0xff08 irq 15 tun: Universal TUN/TAP device driver, 1.6 tun: (C) 1999-2004 Max Krasnyansky <maxk@qualcomm.com> r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded r8169 0000:02:05.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20 r8169 0000:02:05.0: no PCI Express capability eth0: RTL8110s at 0xffffc90000454c00, 00:08:54:36:f2:2f, XID 04000000 IRQ 20 ehci_hcd: USB 2.0 
'Enhanced' Host Controller (EHCI) Driver ehci_hcd 0000:00:12.2: PCI INT B -> GSI 17 (level, low) -> IRQ 17 ehci_hcd 0000:00:12.2: EHCI Host Controller ehci_hcd 0000:00:12.2: new USB bus registered, assigned bus number 1 ehci_hcd 0000:00:12.2: applying AMD SB600/SB700 USB freeze workaround ehci_hcd 0000:00:12.2: debug port 1 ehci_hcd 0000:00:12.2: irq 17, io mem 0xfbcff800 ehci_hcd 0000:00:12.2: USB 2.0 started, EHCI 1.00 hub 1-0:1.0: USB hub found hub 1-0:1.0: 6 ports detected ehci_hcd 0000:00:13.2: PCI INT B -> GSI 19 (level, low) -> IRQ 19 ehci_hcd 0000:00:13.2: EHCI Host Controller ehci_hcd 0000:00:13.2: new USB bus registered, assigned bus number 2 ehci_hcd 0000:00:13.2: applying AMD SB600/SB700 USB freeze workaround ehci_hcd 0000:00:13.2: debug port 1 ehci_hcd 0000:00:13.2: irq 19, io mem 0xfbcff400 ehci_hcd 0000:00:13.2: USB 2.0 started, EHCI 1.00 hub 2-0:1.0: USB hub found hub 2-0:1.0: 6 ports detected ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver ohci_hcd 0000:00:12.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16 ohci_hcd 0000:00:12.0: OHCI Host Controller ohci_hcd 0000:00:12.0: new USB bus registered, assigned bus number 3 ohci_hcd 0000:00:12.0: irq 16, io mem 0xfbcfd000 hub 3-0:1.0: USB hub found hub 3-0:1.0: 3 ports detected ohci_hcd 0000:00:12.1: PCI INT A -> GSI 16 (level, low) -> IRQ 16 ohci_hcd 0000:00:12.1: OHCI Host Controller ohci_hcd 0000:00:12.1: new USB bus registered, assigned bus number 4 ohci_hcd 0000:00:12.1: irq 16, io mem 0xfbcfe000 hub 4-0:1.0: USB hub found hub 4-0:1.0: 3 ports detected ohci_hcd 0000:00:13.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18 ohci_hcd 0000:00:13.0: OHCI Host Controller ohci_hcd 0000:00:13.0: new USB bus registered, assigned bus number 5 ohci_hcd 0000:00:13.0: irq 18, io mem 0xfbcfb000 ata5.00: ATAPI: HL-DT-STDVD-RAM GH22NP20, 1.03, max UDMA/66 ata5.00: configured for UDMA/66 hub 5-0:1.0: USB hub found hub 5-0:1.0: 3 ports detected ohci_hcd 0000:00:13.1: PCI INT A -> GSI 18 (level, low) -> IRQ 18 ohci_hcd 
0000:00:13.1: OHCI Host Controller ohci_hcd 0000:00:13.1: new USB bus registered, assigned bus number 6 ohci_hcd 0000:00:13.1: irq 18, io mem 0xfbcfc000 hub 6-0:1.0: USB hub found hub 6-0:1.0: 3 ports detected ohci_hcd 0000:00:14.5: PCI INT C -> GSI 18 (level, low) -> IRQ 18 ohci_hcd 0000:00:14.5: OHCI Host Controller ohci_hcd 0000:00:14.5: new USB bus registered, assigned bus number 7 ohci_hcd 0000:00:14.5: irq 18, io mem 0xfbcfa000 hub 7-0:1.0: USB hub found hub 7-0:1.0: 2 ports detected Initializing USB Mass Storage driver... usbcore: registered new interface driver usb-storage USB Mass Storage support registered. PNP: No PS/2 controller found. Probing ports directly. serio: i8042 KBD port at 0x60,0x64 irq 1 serio: i8042 AUX port at 0x60,0x64 irq 12 mice: PS/2 mouse device common for all mice i2c /dev entries driver cpuidle: using governor ladder cpuidle: using governor menu usbcore: registered new interface driver usbhid usbhid: USB HID core driver Advanced Linux Sound Architecture Driver Version 1.0.21. usbcore: registered new interface driver snd-usb-audio ALSA device list: No soundcards found. Netfilter messages via NETLINK v0.30. nf_conntrack version 0.5.0 (16384 buckets, 65536 max) ctnetlink v0.93: registering with nfnetlink. 
ip_tables: (C) 2000-2006 Netfilter Core Team TCP cubic registered NET: Registered protocol family 17 powernow-k8: Found 1 AMD Phenom(tm) II X4 955 Processor processors (4 cpu cores) (version 2.20.00) powernow-k8: 0 : pstate 0 (3200 MHz) powernow-k8: 1 : pstate 1 (2500 MHz) powernow-k8: 2 : pstate 2 (2100 MHz) powernow-k8: 3 : pstate 3 (800 MHz) registered taskstats version 1 ata2: SATA link down (SStatus 0 SControl 300) ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300) ata4: SATA link down (SStatus 0 SControl 300) ata3.00: ATA-8: OCZ-VERTEX, 1.4, max UDMA/133 ata3.00: 62533296 sectors, multi 1: LBA48 NCQ (depth 31/32), AA ata3.00: configured for UDMA/133 ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) ata1.00: ATA-7: SAMSUNG HD103UJ, 1AA01118, max UDMA7 ata1.00: 1953525168 sectors, multi 16: LBA48 NCQ (depth 31/32), AA usb 4-1: new full speed USB device using ohci_hcd and address 2 ata1.00: configured for UDMA/133 scsi 0:0:0:0: Direct-Access ATA SAMSUNG HD103UJ 1AA0 PQ: 0 ANSI: 5 sd 0:0:0:0: [sda] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB) sd 0:0:0:0: Attached scsi generic sg0 type 0 sd 0:0:0:0: [sda] Write Protect is off sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00 sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA scsi 2:0:0:0: Direct-Access ATA OCZ-VERTEX 1.4 PQ: 0 ANSI: 5 sda: sd 2:0:0:0: [sdb] 62533296 512-byte logical blocks: (32.0 GB/29.8 GiB) sd 2:0:0:0: [sdb] Write Protect is off sd 2:0:0:0: [sdb] Mode Sense: 00 3a 00 00 sd 2:0:0:0: Attached scsi generic sg1 type 0 sd 2:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sdb: unknown partition table sd 2:0:0:0: [sdb] Attached SCSI disk scsi 4:0:0:0: CD-ROM HL-DT-ST DVD-RAM GH22NP20 1.03 PQ: 0 ANSI: 5 sda1 sda2 sda3 sr0: scsi3-mmc drive: 48x/48x writer dvd-ram cd/rw xa/form2 cdda tray Uniform CD-ROM driver Revision: 3.20 sd 0:0:0:0: [sda] Attached SCSI disk sr 4:0:0:0: Attached scsi CD-ROM sr0 sr 4:0:0:0: Attached scsi 
generic sg2 type 5 EXT4-fs (sdb): INFO: recovery required on readonly filesystem EXT4-fs (sdb): write access will be enabled during recovery EXT4-fs (sdb): recovery complete EXT4-fs (sdb): mounted filesystem with ordered data mode VFS: Mounted root (ext4 filesystem) readonly on device 8:16. Freeing unused kernel memory: 416k freed Write protecting the kernel read-only data: 6144k Freeing unused kernel memory: 324k freed Freeing unused kernel memory: 496k freed input: C-Media USB Headphone Set as /devices/pci0000:00/0000:00:12.1/usb4/4-1/4-1:1.3/input/input2 generic-usb 0003:0D8C:000C.0001: input: USB HID v1.00 Device [C-Media USB Headphone Set ] on usb-0000:00:12.1-1/input3 udev: starting version 146 usb 3-1: new full speed USB device using ohci_hcd and address 2 input: Logitech USB Receiver as /devices/pci0000:00/0000:00:12.0/usb3/3-1/3-1:1.0/input/input3 generic-usb 0003:046D:C526.0002: input: USB HID v1.11 Mouse [Logitech USB Receiver] on usb-0000:00:12.0-1/input0 input: Logitech USB Receiver as /devices/pci0000:00/0000:00:12.0/usb3/3-1/3-1:1.1/input/input4 generic-usb 0003:046D:C526.0003: input: USB HID v1.11 Device [Logitech USB Receiver] on usb-0000:00:12.0-1/input1 usb 3-2: new low speed USB device using ohci_hcd and address 3 input: HID 046a:0021 as /devices/pci0000:00/0000:00:12.0/usb3/3-2/3-2:1.0/input/input5 generic-usb 0003:046A:0021.0004: input: USB HID v1.11 Keyboard [HID 046a:0021] on usb-0000:00:12.0-2/input0 input: HID 046a:0021 as /devices/pci0000:00/0000:00:12.0/usb3/3-2/3-2:1.1/input/input6 generic-usb 0003:046A:0021.0005: input: USB HID v1.11 Device [HID 046a:0021] on usb-0000:00:12.0-2/input1 EXT4-fs (sda1): mounted filesystem with ordered data mode EXT4-fs (sda2): mounted filesystem with ordered data mode EXT4-fs (sda3): mounted filesystem with ordered data mode Adding 255992k swap on /var/cache/swap/swapfile. 
Priority:-1 extents:2 across:354296k r8169: eth0: link up executing set pll executing set crtc timing [drm] TMDS-11: set mode 1680x1050 1d ^ permalink raw reply [flat|nested] 42+ messages in thread
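Editorial note: the memmap.txt attachment in the message above was produced by the /sys/firmware/memmap loop quoted there. A slightly more defensive sketch of the same loop follows, with paths quoted and POSIX arithmetic in place of the older `$[ ... ]` form. The function name dump_memmap and its optional directory argument are additions for illustration only. Like the quoted script, it adds one to `end`, presumably because sysfs reports an inclusive end address.

```sh
# Sketch only: print each firmware memory map entry as
# "start-end (type)", with end made exclusive by adding one.
# dump_memmap and the optional root argument are hypothetical additions.
dump_memmap() {
    root=${1:-/sys/firmware/memmap}
    for dir in "$root"/*/; do
        # Skip anything that is not a readable map entry
        [ -r "${dir}start" ] || continue
        start=$(cat "${dir}start")
        end=$(cat "${dir}end")
        type=$(cat "${dir}type")
        printf '%016x-%016x (%s)\n' "$start" "$(($end + 1))" "$type"
    done
}
```

A run such as `dump_memmap > /tmp/memmap.txt` would then produce a file in the format of the attachment.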
* Re: kexec boot regression radeon/kms (bisected)
2009-12-15 21:30 ` Markus Trippelsdorf
@ 2009-12-15 23:02 ` Markus Trippelsdorf
0 siblings, 0 replies; 42+ messages in thread
From: Markus Trippelsdorf @ 2009-12-15 23:02 UTC (permalink / raw)
To: Yinghai Lu
Cc: Jens Axboe, Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org, Alex Deucher, Dave Airlie
On Tue, Dec 15, 2009 at 10:30:21PM +0100, Markus Trippelsdorf wrote:
> I have the same symptoms on my machine, but the underlying cause must be
> different. I once reverted all Radeon related changes since 2.6.32 and
> kexec started working again.
>
OK, I bisected this down to:
d8f60cfc93452d0554f6a701aa8e3236cbee4636 is the first bad commit
commit d8f60cfc93452d0554f6a701aa8e3236cbee4636
Author: Alex Deucher <alexdeucher@gmail.com>
Date: Tue Dec 1 13:43:46 2009 -0500
drm/radeon/kms: Add support for interrupts on r6xx/r7xx chips (v3)
--
Markus
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: kexec boot regression
2009-12-15 18:39 ` Yinghai Lu
` (2 preceding siblings ...)
2009-12-15 18:59 ` Jens Axboe
@ 2009-12-15 19:43 ` Jens Axboe
2009-12-15 19:48 ` Yinghai Lu
3 siblings, 1 reply; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 19:43 UTC (permalink / raw)
To: Yinghai Lu
Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org
On Tue, Dec 15 2009, Yinghai Lu wrote:
> [PATCH] x86/pci: intel ioh bus num reg accessing fix
>
> it is above 0x100, so if mmconf is not enable, need to skip it
This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
mmconf problem to begin with, are we now just working around the issue?
SRAT still reports issues, numa doesn't work.
--
Jens Axboe
^ permalink raw reply [flat|nested] 42+ messages in thread
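Editorial note: the patch description quoted above rests on the 256-byte limit of conventional (type 1) PCI configuration access. On the Intel system discussed here, a register at offset 0x100 or higher would only be reachable once MMCONFIG is in use. The sketch below shows that check under those assumptions. The function name cfg_reg_reachable and the example offset 0x108 are illustrative and not taken from the patch itself.

```sh
# Sketch only: decide whether a config-space register can be read.
# Assumption: without MMCONFIG only offsets 0x00-0xff are reachable.
# cfg_reg_reachable is a hypothetical helper, not kernel code.
cfg_reg_reachable() {
    offset=$1   # register offset, e.g. 0x108
    mmconf=$2   # "yes" if MMCONFIG is enabled, anything else otherwise
    if [ "$(($offset))" -lt 256 ]; then
        return 0    # inside conventional config space
    fi
    [ "$mmconf" = yes ]
}
```

With this model, `cfg_reg_reachable 0x108 no` fails, which would correspond to the case the patch skips.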
* Re: kexec boot regression
2009-12-15 19:43 ` kexec boot regression Jens Axboe
@ 2009-12-15 19:48 ` Yinghai Lu
2009-12-15 19:51 ` Jens Axboe
0 siblings, 1 reply; 42+ messages in thread
From: Yinghai Lu @ 2009-12-15 19:48 UTC (permalink / raw)
To: Jens Axboe
Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org
Jens Axboe wrote:
> On Tue, Dec 15 2009, Yinghai Lu wrote:
>> [PATCH] x86/pci: intel ioh bus num reg accessing fix
>>
>> it is above 0x100, so if mmconf is not enable, need to skip it
>
> This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
> mmconf problem to begin with, are we now just working around the issue?
> SRAT still reports issues, numa doesn't work.
that patch will be bullet proof... we need it.
also still need to figure out why memmap range is not passed properly.
do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in second kernel?
YH
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: kexec boot regression
2009-12-15 19:48 ` Yinghai Lu
@ 2009-12-15 19:51 ` Jens Axboe
2009-12-15 19:56 ` Yinghai Lu
2009-12-15 20:14 ` Yinghai Lu
0 siblings, 2 replies; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 19:51 UTC (permalink / raw)
To: Yinghai Lu
Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org
On Tue, Dec 15 2009, Yinghai Lu wrote:
> Jens Axboe wrote:
> > On Tue, Dec 15 2009, Yinghai Lu wrote:
> >> [PATCH] x86/pci: intel ioh bus num reg accessing fix
> >>
> >> it is above 0x100, so if mmconf is not enable, need to skip it
> >
> > This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
> > mmconf problem to begin with, are we now just working around the issue?
> > SRAT still reports issues, numa doesn't work.
>
> that patch will be bullet proof... we need it.
>
> also still need to figure out why memmap range is not passed properly.
>
> do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in
> second kernel?
Yes, 2.6.32 booted and 2.6.32 kexec'ed works just fine, no SRAT
complaints and NUMA works fine.
--
Jens Axboe
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: kexec boot regression
2009-12-15 19:51 ` Jens Axboe
@ 2009-12-15 19:56 ` Yinghai Lu
2009-12-15 20:09 ` Jens Axboe
2009-12-15 20:14 ` Yinghai Lu
1 sibling, 1 reply; 42+ messages in thread
From: Yinghai Lu @ 2009-12-15 19:56 UTC (permalink / raw)
To: Jens Axboe
Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org
Jens Axboe wrote:
> On Tue, Dec 15 2009, Yinghai Lu wrote:
>> Jens Axboe wrote:
>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>> [PATCH] x86/pci: intel ioh bus num reg accessing fix
>>>>
>>>> it is above 0x100, so if mmconf is not enable, need to skip it
>>> This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
>>> mmconf problem to begin with, are we now just working around the issue?
>>> SRAT still reports issues, numa doesn't work.
>> that patch will be bullet proof... we need it.
>>
>> also still need to figure out why memmap range is not passed properly.
>>
>> do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in
>> second kernel?
>
> Yes, 2.6.32 booted and 2.6.32 kexec'ed works just fine, no SRAT
> complaints and NUMA works fine.
>
how about
current kernel booted and 2.6.32 kexec'ed works just fine, no SRAT
complaints and NUMA works fine. ?
YH
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: kexec boot regression
2009-12-15 19:56 ` Yinghai Lu
@ 2009-12-15 20:09 ` Jens Axboe
0 siblings, 0 replies; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 20:09 UTC (permalink / raw)
To: Yinghai Lu
Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org

On Tue, Dec 15 2009, Yinghai Lu wrote:
> Jens Axboe wrote:
> > On Tue, Dec 15 2009, Yinghai Lu wrote:
> >> Jens Axboe wrote:
> >>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>> [PATCH] x86/pci: intel ioh bus num reg accessing fix
> >>>>
> >>>> it is above 0x100, so if mmconf is not enable, need to skip it
> >>> This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
> >>> mmconf problem to begin with, are we now just working around the issue?
> >>> SRAT still reports issues, numa doesn't work.
> >> that patch will be bullet proof... we need it.
> >>
> >> also still need to figure out why memmap range is not passed properly.
> >>
> >> do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in
> >> second kernel?
> >
> > Yes, 2.6.32 booted and 2.6.32 kexec'ed works just fine, no SRAT
> > complaints and NUMA works fine.
> >
> how about
>
> current kernel booted and 2.6.32 kexec'ed works just fine, no SRAT
> complaints and NUMA works fine. ?

Yes, that's exactly what happens, see the previous reply I sent. mmconf
still complains, though.

--
Jens Axboe
* Re: kexec boot regression
2009-12-15 19:51 ` Jens Axboe
2009-12-15 19:56 ` Yinghai Lu
@ 2009-12-15 20:14 ` Yinghai Lu
2009-12-15 20:19 ` Jens Axboe
1 sibling, 1 reply; 42+ messages in thread
From: Yinghai Lu @ 2009-12-15 20:14 UTC (permalink / raw)
To: Jens Axboe
Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org

Jens Axboe wrote:
> On Tue, Dec 15 2009, Yinghai Lu wrote:
>> Jens Axboe wrote:
>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>> [PATCH] x86/pci: intel ioh bus num reg accessing fix
>>>>
>>>> it is above 0x100, so if mmconf is not enable, need to skip it
>>> This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
>>> mmconf problem to begin with, are we now just working around the issue?
>>> SRAT still reports issues, numa doesn't work.
>> that patch will be bullet proof... we need it.
>>
>> also still need to figure out why memmap range is not passed properly.
>>
>> do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in
>> second kernel?
>
> Yes, 2.6.32 booted and 2.6.32 kexec'ed works just fine, no SRAT
> complaints and NUMA works fine.

do you need
memmap=62G@4G
in this case?

YH
* Re: kexec boot regression
2009-12-15 20:14 ` Yinghai Lu
@ 2009-12-15 20:19 ` Jens Axboe
2009-12-15 20:21 ` Yinghai Lu
0 siblings, 1 reply; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 20:19 UTC (permalink / raw)
To: Yinghai Lu
Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org

On Tue, Dec 15 2009, Yinghai Lu wrote:
> Jens Axboe wrote:
> > On Tue, Dec 15 2009, Yinghai Lu wrote:
> >> Jens Axboe wrote:
> >>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>> [PATCH] x86/pci: intel ioh bus num reg accessing fix
> >>>>
> >>>> it is above 0x100, so if mmconf is not enable, need to skip it
> >>> This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
> >>> mmconf problem to begin with, are we now just working around the issue?
> >>> SRAT still reports issues, numa doesn't work.
> >> that patch will be bullet proof... we need it.
> >>
> >> also still need to figure out why memmap range is not passed properly.
> >>
> >> do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in
> >> second kernel?
> >
> > Yes, 2.6.32 booted and 2.6.32 kexec'ed works just fine, no SRAT
> > complaints and NUMA works fine.
>
> do you need
> memmap=62G@4G
> in this case?

Yes, I've needed that always.

--
Jens Axboe
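Editorial note: memmap=62G@4G describes a 62 GiB region that starts at the 4 GiB mark. The sketch below works out where that region ends, assuming the usual size@start reading of the parameter and binary gigabytes. The helper name is made up for illustration.

```c
#include <stdint.h>

#define GIB (UINT64_C(1) << 30)

/* Hypothetical helper: exclusive end address of a "size@start" region. */
uint64_t memmap_region_end(uint64_t size_bytes, uint64_t start_bytes)
{
	return start_bytes + size_bytes;
}
```

Calling memmap_region_end(62 * GIB, 4 * GIB) gives 0x1080000000 (66 GiB), which is the same address at which the last SRAT range in the boot log further down the thread ends (c80000000-1080000000).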
* Re: kexec boot regression
2009-12-15 20:19 ` Jens Axboe
@ 2009-12-15 20:21 ` Yinghai Lu
2009-12-15 20:42 ` Jens Axboe
0 siblings, 1 reply; 42+ messages in thread
From: Yinghai Lu @ 2009-12-15 20:21 UTC (permalink / raw)
To: Jens Axboe
Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org, H. Peter Anvin, Huang Ying

Jens Axboe wrote:
> On Tue, Dec 15 2009, Yinghai Lu wrote:
>> Jens Axboe wrote:
>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>> Jens Axboe wrote:
>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>>>> [PATCH] x86/pci: intel ioh bus num reg accessing fix
>>>>>>
>>>>>> it is above 0x100, so if mmconf is not enable, need to skip it
>>>>> This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
>>>>> mmconf problem to begin with, are we now just working around the issue?
>>>>> SRAT still reports issues, numa doesn't work.
>>>> that patch will be bullet proof... we need it.
>>>>
>>>> also still need to figure out why memmap range is not passed properly.
>>>>
>>>> do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in
>>>> second kernel?
>>> Yes, 2.6.32 booted and 2.6.32 kexec'ed works just fine, no SRAT
>>> complaints and NUMA works fine.
>> do you need
>> memmap=62G@4G
>> in this case?
>
> Yes, I've needed that always.

good,

can you enable debug option in kexec to see why kexec can not pass
whole 38? range to second kernel?

YH
* Re: kexec boot regression
2009-12-15 20:21 ` Yinghai Lu
@ 2009-12-15 20:42 ` Jens Axboe
2009-12-15 20:55 ` Jens Axboe
0 siblings, 1 reply; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 20:42 UTC (permalink / raw)
To: Yinghai Lu
Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org, H. Peter Anvin, Huang Ying

On Tue, Dec 15 2009, Yinghai Lu wrote:
> Jens Axboe wrote:
> > On Tue, Dec 15 2009, Yinghai Lu wrote:
> >> Jens Axboe wrote:
> >>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>> Jens Axboe wrote:
> >>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>>>> [PATCH] x86/pci: intel ioh bus num reg accessing fix
> >>>>>>
> >>>>>> it is above 0x100, so if mmconf is not enable, need to skip it
> >>>>> This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
> >>>>> mmconf problem to begin with, are we now just working around the issue?
> >>>>> SRAT still reports issues, numa doesn't work.
> >>>> that patch will be bullet proof... we need it.
> >>>>
> >>>> also still need to figure out why memmap range is not passed properly.
> >>>>
> >>>> do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in
> >>>> second kernel?
> >>> Yes, 2.6.32 booted and 2.6.32 kexec'ed works just fine, no SRAT
> >>> complaints and NUMA works fine.
> >> do you need
> >> memmap=62G@4G
> >> in this case?
> >
> > Yes, I've needed that always.
>
> good,
>
> can you enable debug option in kexec to see why kexec can not pass
> whole 38? range to second kernel?

Not getting any output so far, -d doesn't do much. Poking around in the
source...

--
Jens Axboe
* Re: kexec boot regression
2009-12-15 20:42 ` Jens Axboe
@ 2009-12-15 20:55 ` Jens Axboe
2009-12-15 21:01 ` Jens Axboe
0 siblings, 1 reply; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 20:55 UTC (permalink / raw)
To: Yinghai Lu
Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org, H. Peter Anvin, Huang Ying

On Tue, Dec 15 2009, Jens Axboe wrote:
> On Tue, Dec 15 2009, Yinghai Lu wrote:
> > Jens Axboe wrote:
> > > On Tue, Dec 15 2009, Yinghai Lu wrote:
> > >> Jens Axboe wrote:
> > >>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> > >>>> Jens Axboe wrote:
> > >>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> > >>>>>> [PATCH] x86/pci: intel ioh bus num reg accessing fix
> > >>>>>>
> > >>>>>> it is above 0x100, so if mmconf is not enable, need to skip it
> > >>>>> This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
> > >>>>> mmconf problem to begin with, are we now just working around the issue?
> > >>>>> SRAT still reports issues, numa doesn't work.
> > >>>> that patch will be bullet proof... we need it.
> > >>>>
> > >>>> also still need to figure out why memmap range is not passed properly.
> > >>>>
> > >>>> do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in
> > >>>> second kernel?
> > >>> Yes, 2.6.32 booted and 2.6.32 kexec'ed works just fine, no SRAT
> > >>> complaints and NUMA works fine.
> > >> do you need
> > >> memmap=62G@4G
> > >> in this case?
> > >
> > > Yes, I've needed that always.
> >
> > good,
> >
> > can you enable debug option in kexec to see why kexec can not pass
> > whole 38? range to second kernel?
>
> Not getting any output so far, -d doesn't do much. Poking around in the
> source...

OK, cold boot and kexec 2.0.1 gets all 39 ranges passed properly to
kexec'ed kernels. Since the older kexec stopped at range 30 (31 ranges
total), that smells like just a kexec bug. Retesting -git...

--
Jens Axboe
* Re: kexec boot regression
2009-12-15 20:55 ` Jens Axboe
@ 2009-12-15 21:01 ` Jens Axboe
2009-12-15 21:26 ` Yinghai Lu
0 siblings, 1 reply; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 21:01 UTC (permalink / raw)
To: Yinghai Lu
Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org, H. Peter Anvin, Huang Ying

On Tue, Dec 15 2009, Jens Axboe wrote:
> On Tue, Dec 15 2009, Jens Axboe wrote:
> > On Tue, Dec 15 2009, Yinghai Lu wrote:
> > > Jens Axboe wrote:
> > > > On Tue, Dec 15 2009, Yinghai Lu wrote:
> > > >> Jens Axboe wrote:
> > > >>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> > > >>>> Jens Axboe wrote:
> > > >>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> > > >>>>>> [PATCH] x86/pci: intel ioh bus num reg accessing fix
> > > >>>>>>
> > > >>>>>> it is above 0x100, so if mmconf is not enable, need to skip it
> > > >>>>> This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
> > > >>>>> mmconf problem to begin with, are we now just working around the issue?
> > > >>>>> SRAT still reports issues, numa doesn't work.
> > > >>>> that patch will be bullet proof... we need it.
> > > >>>>
> > > >>>> also still need to figure out why memmap range is not passed properly.
> > > >>>>
> > > >>>> do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in
> > > >>>> second kernel?
> > > >>> Yes, 2.6.32 booted and 2.6.32 kexec'ed works just fine, no SRAT
> > > >>> complaints and NUMA works fine.
> > > >> do you need
> > > >> memmap=62G@4G
> > > >> in this case?
> > > >
> > > > Yes, I've needed that always.
> > >
> > > good,
> > >
> > > can you enable debug option in kexec to see why kexec can not pass
> > > whole 38? range to second kernel?
> >
> > Not getting any output so far, -d doesn't do much. Poking around in the
> > source...
>
> OK, cold boot and kexec 2.0.1 gets all 39 ranges passed properly to
> kexec'ed kernels. Since the older kexec stopped at range 30 (31 ranges
> total), that smells like just a kexec bug. Retesting -git...

Current -git works fine when all the ranges are passed correctly. So, I
think, the only existing regression is the SRAT issue.

--
Jens Axboe
* Re: kexec boot regression
2009-12-15 21:01 ` Jens Axboe
@ 2009-12-15 21:26 ` Yinghai Lu
2009-12-15 21:30 ` Jens Axboe
0 siblings, 1 reply; 42+ messages in thread
From: Yinghai Lu @ 2009-12-15 21:26 UTC (permalink / raw)
To: Jens Axboe
Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org, H. Peter Anvin, Huang Ying

Jens Axboe wrote:
> On Tue, Dec 15 2009, Jens Axboe wrote:
>> On Tue, Dec 15 2009, Jens Axboe wrote:
>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>> Jens Axboe wrote:
>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>>>> Jens Axboe wrote:
>>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>>>>>> Jens Axboe wrote:
>>>>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>>>>>>>> [PATCH] x86/pci: intel ioh bus num reg accessing fix
>>>>>>>>>>
>>>>>>>>>> it is above 0x100, so if mmconf is not enable, need to skip it
>>>>>>>>> This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
>>>>>>>>> mmconf problem to begin with, are we now just working around the issue?
>>>>>>>>> SRAT still reports issues, numa doesn't work.
>>>>>>>> that patch will be bullet proof... we need it.
>>>>>>>>
>>>>>>>> also still need to figure out why memmap range is not passed properly.
>>>>>>>>
>>>>>>>> do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in
>>>>>>>> second kernel?
>>>>>>> Yes, 2.6.32 booted and 2.6.32 kexec'ed works just fine, no SRAT
>>>>>>> complaints and NUMA works fine.
>>>>>> do you need
>>>>>> memmap=62G@4G
>>>>>> in this case?
>>>>> Yes, I've needed that always.
>>>> good,
>>>>
>>>> can you enable debug option in kexec to see why kexec can not pass
>>>> whole 38? range to second kernel?
>>> Not getting any output so far, -d doesn't do much. Poking around in the
>>> source...
>> OK, cold boot and kexec 2.0.1 gets all 39 ranges passed properly to
>> kexec'ed kernels. Since the older kexec stopped at range 30 (31 ranges
>> total), that smells like just a kexec bug. Retesting -git...
>
> Current -git works fine when all the ranges are passed correctly. So, I
> think, the only existing regression is the SRAT issue.

did you change node_shift?

YH
* Re: kexec boot regression
2009-12-15 21:26 ` Yinghai Lu
@ 2009-12-15 21:30 ` Jens Axboe
2009-12-15 21:40 ` Jens Axboe
0 siblings, 1 reply; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 21:30 UTC (permalink / raw)
To: Yinghai Lu
Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org, H. Peter Anvin, Huang Ying

On Tue, Dec 15 2009, Yinghai Lu wrote:
> Jens Axboe wrote:
> > On Tue, Dec 15 2009, Jens Axboe wrote:
> >> On Tue, Dec 15 2009, Jens Axboe wrote:
> >>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>> Jens Axboe wrote:
> >>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>>>> Jens Axboe wrote:
> >>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>>>>>> Jens Axboe wrote:
> >>>>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>>>>>>>> [PATCH] x86/pci: intel ioh bus num reg accessing fix
> >>>>>>>>>>
> >>>>>>>>>> it is above 0x100, so if mmconf is not enable, need to skip it
> >>>>>>>>> This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
> >>>>>>>>> mmconf problem to begin with, are we now just working around the issue?
> >>>>>>>>> SRAT still reports issues, numa doesn't work.
> >>>>>>>> that patch will be bullet proof... we need it.
> >>>>>>>>
> >>>>>>>> also still need to figure out why memmap range is not passed properly.
> >>>>>>>>
> >>>>>>>> do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in
> >>>>>>>> second kernel?
> >>>>>>> Yes, 2.6.32 booted and 2.6.32 kexec'ed works just fine, no SRAT
> >>>>>>> complaints and NUMA works fine.
> >>>>>> do you need
> >>>>>> memmap=62G@4G
> >>>>>> in this case?
> >>>>> Yes, I've needed that always.
> >>>> good,
> >>>>
> >>>> can you enable debug option in kexec to see why kexec can not pass
> >>>> whole 38? range to second kernel?
> >>> Not getting any output so far, -d doesn't do much. Poking around in the
> >>> source...
> >> OK, cold boot and kexec 2.0.1 gets all 39 ranges passed properly to
> >> kexec'ed kernels. Since the older kexec stopped at range 30 (31 ranges
> >> total), that smells like just a kexec bug. Retesting -git...
> >
> > Current -git works fine when all the ranges are passed correctly. So, I
> > think, the only existing regression is the SRAT issue.
>
> did you change node_shift?

Yes:

CONFIG_NODES_SHIFT=6

What I don't get is that 2.6.32 and -git print the same PXM map, and in
both cases it's totalling exactly 64G. Yet it says:

SRAT: PXMs only cover 49035MB of your 65419MB e820 RAM. Not used.

--
Jens Axboe
* Re: kexec boot regression
2009-12-15 21:30 ` Jens Axboe
@ 2009-12-15 21:40 ` Jens Axboe
2009-12-15 21:43 ` Yinghai Lu
0 siblings, 1 reply; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 21:40 UTC (permalink / raw)
To: Yinghai Lu
Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org, H. Peter Anvin, Huang Ying

On Tue, Dec 15 2009, Jens Axboe wrote:
> On Tue, Dec 15 2009, Yinghai Lu wrote:
> > Jens Axboe wrote:
> > > On Tue, Dec 15 2009, Jens Axboe wrote:
> > >> On Tue, Dec 15 2009, Jens Axboe wrote:
> > >>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> > >>>> Jens Axboe wrote:
> > >>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> > >>>>>> Jens Axboe wrote:
> > >>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> > >>>>>>>> Jens Axboe wrote:
> > >>>>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> > >>>>>>>>>> [PATCH] x86/pci: intel ioh bus num reg accessing fix
> > >>>>>>>>>>
> > >>>>>>>>>> it is above 0x100, so if mmconf is not enable, need to skip it
> > >>>>>>>>> This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
> > >>>>>>>>> mmconf problem to begin with, are we now just working around the issue?
> > >>>>>>>>> SRAT still reports issues, numa doesn't work.
> > >>>>>>>> that patch will be bullet proof... we need it.
> > >>>>>>>>
> > >>>>>>>> also still need to figure out why memmap range is not passed properly.
> > >>>>>>>>
> > >>>>>>>> do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in
> > >>>>>>>> second kernel?
> > >>>>>>> Yes, 2.6.32 booted and 2.6.32 kexec'ed works just fine, no SRAT
> > >>>>>>> complaints and NUMA works fine.
> > >>>>>> do you need
> > >>>>>> memmap=62G@4G
> > >>>>>> in this case?
> > >>>>> Yes, I've needed that always.
> > >>>> good,
> > >>>>
> > >>>> can you enable debug option in kexec to see why kexec can not pass
> > >>>> whole 38? range to second kernel?
> > >>> Not getting any output so far, -d doesn't do much. Poking around in the
> > >>> source...
> > >> OK, cold boot and kexec 2.0.1 gets all 39 ranges passed properly to
> > >> kexec'ed kernels. Since the older kexec stopped at range 30 (31 ranges
> > >> total), that smells like just a kexec bug. Retesting -git...
> > >
> > > Current -git works fine when all the ranges are passed correctly. So, I
> > > think, the only existing regression is the SRAT issue.
> >
> > did you change node_shift?
>
> Yes:
>
> CONFIG_NODES_SHIFT=6
>
> What I don't get is that 2.6.32 and -git print the same PXM map, and in
> both cases it's totalling exactly 64G. Yet it says:
>
> SRAT: PXMs only cover 49035MB of your 65419MB e820 RAM. Not used.

Clue:

[ 0.000000] SRAT: Node 0 PXM 0 0-80000000
[ 0.000000] SRAT: Node 0 PXM 0 100000000-480000000
[ 0.000000] SRAT: Node 2 PXM 1 480000000-880000000
[ 0.000000] SRAT: Node 1 PXM 2 880000000-c80000000
[ 0.000000] SRAT: Node 3 PXM 3 c80000000-1080000000
[ 0.000000] NUMA: Using 31 for the hash shift.
[ 0.000000] pxm0: 0-480000 (4718592), absent 553990
[ 0.000000] pxm1: 880000-c80000 (4194304), absent 0
[ 0.000000] pxm2: 480000-880000 (4194304), absent 4194304
[ 0.000000] pxm3: c80000-1080000 (4194304), absent 0
[ 0.000000] SRAT: PXMs only cover 49035MB of your 65419MB e820 RAM. Not used.
[ 0.000000] SRAT: SRAT not used.

It's essentially disregarding pxm2, claiming all pages are absent.

--
Jens Axboe
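Editorial note: the two megabyte figures in the SRAT complaint can be reproduced from the pxm lines in the log above. The sketch below assumes 4 KiB pages and that the reported figure is the sum of present pages (range size minus absent pages) truncated to whole megabytes. The struct and function names are invented for this example and do not come from the kernel.

```c
#include <stddef.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */

struct pxm_span {
	unsigned long pages;	/* pages in the PXM's range */
	unsigned long absent;	/* pages reported absent in that range */
};

/* Sum the present pages over all spans and convert to whole megabytes. */
unsigned long covered_mb(const struct pxm_span *spans, size_t n)
{
	unsigned long long pages = 0;
	size_t i;

	for (i = 0; i < n; i++)
		pages += spans[i].pages - spans[i].absent;

	return (unsigned long)((pages << PAGE_SHIFT) >> 20);
}
```

Fed the four pxm lines from the log (4718592/553990, 4194304/0, 4194304/4194304, 4194304/0), this gives 49035. With the absent count for pxm2 set to 0 it gives 65419, so the whole shortfall in the message is accounted for by pxm2 being counted as absent.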
* Re: kexec boot regression
2009-12-15 21:40 ` Jens Axboe
@ 2009-12-15 21:43 ` Yinghai Lu
2009-12-15 21:47 ` Jens Axboe
0 siblings, 1 reply; 42+ messages in thread
From: Yinghai Lu @ 2009-12-15 21:43 UTC (permalink / raw)
To: Jens Axboe
Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org, H. Peter Anvin, Huang Ying

[-- Attachment #1: Type: text/plain, Size: 2992 bytes --]

Jens Axboe wrote:
> On Tue, Dec 15 2009, Jens Axboe wrote:
>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>> Jens Axboe wrote:
>>>> On Tue, Dec 15 2009, Jens Axboe wrote:
>>>>> On Tue, Dec 15 2009, Jens Axboe wrote:
>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>>>>> Jens Axboe wrote:
>>>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>>>>>>> Jens Axboe wrote:
>>>>>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>>>>>>>>> Jens Axboe wrote:
>>>>>>>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>>>>>>>>>>> [PATCH] x86/pci: intel ioh bus num reg accessing fix
>>>>>>>>>>>>>
>>>>>>>>>>>>> it is above 0x100, so if mmconf is not enable, need to skip it
>>>>>>>>>>>> This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
>>>>>>>>>>>> mmconf problem to begin with, are we now just working around the issue?
>>>>>>>>>>>> SRAT still reports issues, numa doesn't work.
>>>>>>>>>>> that patch will be bullet proof... we need it.
>>>>>>>>>>>
>>>>>>>>>>> also still need to figure out why memmap range is not passed properly.
>>>>>>>>>>>
>>>>>>>>>>> do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in
>>>>>>>>>>> second kernel?
>>>>>>>>>> Yes, 2.6.32 booted and 2.6.32 kexec'ed works just fine, no SRAT
>>>>>>>>>> complaints and NUMA works fine.
>>>>>>>>> do you need
>>>>>>>>> memmap=62G@4G
>>>>>>>>> in this case?
>>>>>>>> Yes, I've needed that always.
>>>>>>> good,
>>>>>>>
>>>>>>> can you enable debug option in kexec to see why kexec can not pass
>>>>>>> whole 38? range to second kernel?
>>>>>> Not getting any output so far, -d doesn't do much. Poking around in the
>>>>>> source...
>>>>> OK, cold boot and kexec 2.0.1 gets all 39 ranges passed properly to
>>>>> kexec'ed kernels. Since the older kexec stopped at range 30 (31 ranges
>>>>> total), that smells like just a kexec bug. Retesting -git...
>>>> Current -git works fine when all the ranges are passed correctly. So, I
>>>> think, the only existing regression is the SRAT issue.
>>> did you change node_shift?
>> Yes:
>>
>> CONFIG_NODES_SHIFT=6
>>
>> What I don't get is that 2.6.32 and -git print the same PXM map, and in
>> both cases it's totalling exactly 64G. Yet it says:
>>
>> SRAT: PXMs only cover 49035MB of your 65419MB e820 RAM. Not used.
>
> Clue:
>
> [ 0.000000] SRAT: Node 0 PXM 0 0-80000000
> [ 0.000000] SRAT: Node 0 PXM 0 100000000-480000000
> [ 0.000000] SRAT: Node 2 PXM 1 480000000-880000000
> [ 0.000000] SRAT: Node 1 PXM 2 880000000-c80000000
> [ 0.000000] SRAT: Node 3 PXM 3 c80000000-1080000000
> [ 0.000000] NUMA: Using 31 for the hash shift.
> [ 0.000000] pxm0: 0-480000 (4718592), absent 553990
> [ 0.000000] pxm1: 880000-c80000 (4194304), absent 0
> [ 0.000000] pxm2: 480000-880000 (4194304), absent 4194304
> [ 0.000000] pxm3: c80000-1080000 (4194304), absent 0
> [ 0.000000] SRAT: PXMs only cover 49035MB of your 65419MB e820 RAM. Not used.
> [ 0.000000] SRAT: SRAT not used.
>

oh, i post one patch last week,

can you check it?

YH

[-- Attachment #2: Attached Message --]
[-- Type: message/rfc822, Size: 5721 bytes --]

From: Yinghai Lu <yinghai@kernel.org>
To: Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>, "H. Peter Anvin" <hpa@zytor.com>, Andrew Morton <akpm@linux-foundation.org>, Mel Gorman <mel@csn.ul.ie>, Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: [PATCH] x86: fix checking of SRAT when node0 ram is not from 0 -v2
Date: Sun, 13 Dec 2009 15:33:38 -0800
Message-ID: <4B2579D2.3010201@kernel.org>

Found one system that boot from socket1 instead of socket0, SRAT get rejected...

[ 0.000000] SRAT: Node 1 PXM 0 0-a0000
[ 0.000000] SRAT: Node 1 PXM 0 100000-80000000
[ 0.000000] SRAT: Node 1 PXM 0 100000000-2080000000
[ 0.000000] SRAT: Node 0 PXM 1 2080000000-4080000000
[ 0.000000] SRAT: Node 2 PXM 2 4080000000-6080000000
[ 0.000000] SRAT: Node 3 PXM 3 6080000000-8080000000
[ 0.000000] SRAT: Node 4 PXM 4 8080000000-a080000000
[ 0.000000] SRAT: Node 5 PXM 5 a080000000-c080000000
[ 0.000000] SRAT: Node 6 PXM 6 c080000000-e080000000
[ 0.000000] SRAT: Node 7 PXM 7 e080000000-10080000000
...
[ 0.000000] NUMA: Allocated memnodemap from 500000 - 701040
[ 0.000000] NUMA: Using 20 for the hash shift.
[ 0.000000] Adding active range (0, 0x2080000, 0x4080000) 0 entries of 3200 used
[ 0.000000] Adding active range (1, 0x0, 0x96) 1 entries of 3200 used
[ 0.000000] Adding active range (1, 0x100, 0x7f750) 2 entries of 3200 used
[ 0.000000] Adding active range (1, 0x100000, 0x2080000) 3 entries of 3200 used
[ 0.000000] Adding active range (2, 0x4080000, 0x6080000) 4 entries of 3200 used
[ 0.000000] Adding active range (3, 0x6080000, 0x8080000) 5 entries of 3200 used
[ 0.000000] Adding active range (4, 0x8080000, 0xa080000) 6 entries of 3200 used
[ 0.000000] Adding active range (5, 0xa080000, 0xc080000) 7 entries of 3200 used
[ 0.000000] Adding active range (6, 0xc080000, 0xe080000) 8 entries of 3200 used
[ 0.000000] Adding active range (7, 0xe080000, 0x10080000) 9 entries of 3200 used
[ 0.000000] SRAT: PXMs only cover 917504MB of your 1048566MB e820 RAM. Not used.
[ 0.000000] SRAT: SRAT not used.

the early_node_map is not sorted because node0 with non zero start come first.

so try to sort it right away after all regions are registered.

-v2: make it more solid to handle cross node case like node0 [0,4g), [8,12g) and node1 [4g, 8g), [12g, 16g)

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 arch/x86/mm/srat_32.c |    2 ++
 arch/x86/mm/srat_64.c |    4 +++-
 include/linux/mm.h    |    3 +++
 mm/page_alloc.c       |    4 ++--
 4 files changed, 10 insertions(+), 3 deletions(-)

Index: linux-2.6/arch/x86/mm/srat_32.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/srat_32.c
+++ linux-2.6/arch/x86/mm/srat_32.c
@@ -267,6 +267,8 @@ int __init get_memcfg_from_srat(void)
 		e820_register_active_regions(chunk->nid, chunk->start_pfn,
 					     min(chunk->end_pfn, max_pfn));
 	}
+	/* for out of order entries in SRAT */
+	sort_node_map();
 
 	for_each_online_node(nid) {
 		unsigned long start = node_start_pfn[nid];
Index: linux-2.6/arch/x86/mm/srat_64.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/srat_64.c
+++ linux-2.6/arch/x86/mm/srat_64.c
@@ -317,7 +317,7 @@ static int __init nodes_cover_memory(con
 		unsigned long s = nodes[i].start >> PAGE_SHIFT;
 		unsigned long e = nodes[i].end >> PAGE_SHIFT;
 		pxmram += e - s;
-		pxmram -= absent_pages_in_range(s, e);
+		pxmram -= __absent_pages_in_range(i, s, e);
 		if ((long)pxmram < 0)
 			pxmram = 0;
 	}
@@ -373,6 +373,8 @@ int __init acpi_scan_nodes(unsigned long
 	for_each_node_mask(i, nodes_parsed)
 		e820_register_active_regions(i, nodes[i].start >> PAGE_SHIFT,
 						nodes[i].end >> PAGE_SHIFT);
+	/* for out of order entries in SRAT */
+	sort_node_map();
 	if (!nodes_cover_memory(nodes)) {
 		bad_srat();
 		return -1;
Index: linux-2.6/include/linux/mm.h
===================================================================
--- linux-2.6.orig/include/linux/mm.h
+++ linux-2.6/include/linux/mm.h
@@ -1022,6 +1022,9 @@ extern void add_active_range(unsigned in
 extern void remove_active_range(unsigned int nid, unsigned long start_pfn,
 					unsigned long end_pfn);
 extern void remove_all_active_ranges(void);
+void sort_node_map(void);
+unsigned long __absent_pages_in_range(int nid, unsigned long start_pfn,
+						unsigned long end_pfn);
 extern unsigned long absent_pages_in_range(unsigned long start_pfn,
 						unsigned long end_pfn);
 extern void get_pfn_range_for_nid(unsigned int nid,
Index: linux-2.6/mm/page_alloc.c
===================================================================
--- linux-2.6.orig/mm/page_alloc.c
+++ linux-2.6/mm/page_alloc.c
@@ -3573,7 +3573,7 @@ static unsigned long __meminit zone_span
  * Return the number of holes in a range on a node. If nid is MAX_NUMNODES,
  * then all holes in the requested range will be accounted for.
  */
-static unsigned long __meminit __absent_pages_in_range(int nid,
+unsigned long __meminit __absent_pages_in_range(int nid,
 				unsigned long range_start_pfn,
 				unsigned long range_end_pfn)
 {
@@ -4102,7 +4102,7 @@ static int __init cmp_node_active_region
 }
 
 /* sort the node_map by start_pfn */
-static void __init sort_node_map(void)
+void __init sort_node_map(void)
 {
 	sort(early_node_map, (size_t)nr_nodemap_entries,
 			sizeof(struct node_active_region),
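Editorial note: the patch description says the early_node_map is not sorted and that sorting it right after registration fixes the SRAT check. The sketch below is a simplified model of why ordering can matter. It assumes a hole counter that walks the map in array order and stops at the first entry starting beyond the queried range. All names here are made up for illustration. This is not the kernel's __absent_pages_in_range() or sort_node_map(), and the real code may fail in a different way.

```c
#include <stdlib.h>

struct active_range {
	int nid;
	unsigned long start_pfn;
	unsigned long end_pfn;
};

/* qsort comparator: ascending start_pfn */
int cmp_start_pfn(const void *a, const void *b)
{
	const struct active_range *ra = a, *rb = b;

	return (ra->start_pfn > rb->start_pfn) - (ra->start_pfn < rb->start_pfn);
}

/*
 * Count pages in [s, e) not covered by any range. Correct only if the
 * map is sorted by start_pfn, because the walk stops early.
 */
unsigned long absent_pages(const struct active_range *map, size_t n,
			   unsigned long s, unsigned long e)
{
	unsigned long hole = 0, prev_end = s;
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned long start = map[i].start_pfn;
		unsigned long end = map[i].end_pfn;

		if (end <= s)
			continue;	/* entirely below the query */
		if (start >= e)
			break;		/* assumes nothing later can overlap */
		if (start > prev_end)
			hole += start - prev_end;
		if (end > prev_end)
			prev_end = end < e ? end : e;
	}
	if (e > prev_end)
		hole += e - prev_end;
	return hole;
}
```

With the four nodes from Jens's log registered in node order (node 0 simplified to a single range 0x0-0x480000, then node 1 at 0x880000-0xc80000, node 2 at 0x480000-0x880000, node 3 at 0xc80000-0x1080000), a query for node 2's span returns 4194304 absent pages, the same number the log prints for pxm2. After qsort() with cmp_start_pfn the same query returns 0. The match is suggestive only; it does not show that the kernel loop behaves exactly like this model.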
* Re: kexec boot regression
2009-12-15 21:43 ` Yinghai Lu
@ 2009-12-15 21:47 ` Jens Axboe
2009-12-15 21:50 ` Yinghai Lu
2009-12-15 21:52 ` Jens Axboe
0 siblings, 2 replies; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 21:47 UTC (permalink / raw)
To: Yinghai Lu
Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org, H. Peter Anvin, Huang Ying, rientjes

On Tue, Dec 15 2009, Yinghai Lu wrote:
> Jens Axboe wrote:
> > On Tue, Dec 15 2009, Jens Axboe wrote:
> >> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>> Jens Axboe wrote:
> >>>> On Tue, Dec 15 2009, Jens Axboe wrote:
> >>>>> On Tue, Dec 15 2009, Jens Axboe wrote:
> >>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>>>>> Jens Axboe wrote:
> >>>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>>>>>>> Jens Axboe wrote:
> >>>>>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>>>>>>>>> Jens Axboe wrote:
> >>>>>>>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
> >>>>>>>>>>>>> [PATCH] x86/pci: intel ioh bus num reg accessing fix
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> it is above 0x100, so if mmconf is not enable, need to skip it
> >>>>>>>>>>>> This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
> >>>>>>>>>>>> mmconf problem to begin with, are we now just working around the issue?
> >>>>>>>>>>>> SRAT still reports issues, numa doesn't work.
> >>>>>>>>>>> that patch will be bullet proof... we need it.
> >>>>>>>>>>>
> >>>>>>>>>>> also still need to figure out why memmap range is not passed properly.
> >>>>>>>>>>>
> >>>>>>>>>>> do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in
> >>>>>>>>>>> second kernel?
> >>>>>>>>>> Yes, 2.6.32 booted and 2.6.32 kexec'ed works just fine, no SRAT
> >>>>>>>>>> complaints and NUMA works fine.
> >>>>>>>>> do you need
> >>>>>>>>> memmap=62G@4G
> >>>>>>>>> in this case?
> >>>>>>>> Yes, I've needed that always.
> >>>>>>> good,
> >>>>>>>
> >>>>>>> can you enable debug option in kexec to see why kexec can not pass
> >>>>>>> whole 38? range to second kernel?
> >>>>>> Not getting any output so far, -d doesn't do much. Poking around in the
> >>>>>> source...
> >>>>> OK, cold boot and kexec 2.0.1 gets all 39 ranges passed properly to
> >>>>> kexec'ed kernels. Since the older kexec stopped at range 30 (31 ranges
> >>>>> total), that smells like just a kexec bug. Retesting -git...
> >>>> Current -git works fine when all the ranges are passed correctly. So, I
> >>>> think, the only existing regression is the SRAT issue.
> >>> did you change node_shift?
> >> Yes:
> >>
> >> CONFIG_NODES_SHIFT=6
> >>
> >> What I don't get is that 2.6.32 and -git print the same PXM map, and in
> >> both cases it's totalling exactly 64G. Yet it says:
> >>
> >> SRAT: PXMs only cover 49035MB of your 65419MB e820 RAM. Not used.
> >
> > Clue:
> >
> > [ 0.000000] SRAT: Node 0 PXM 0 0-80000000
> > [ 0.000000] SRAT: Node 0 PXM 0 100000000-480000000
> > [ 0.000000] SRAT: Node 2 PXM 1 480000000-880000000
> > [ 0.000000] SRAT: Node 1 PXM 2 880000000-c80000000
> > [ 0.000000] SRAT: Node 3 PXM 3 c80000000-1080000000
> > [ 0.000000] NUMA: Using 31 for the hash shift.
> > [ 0.000000] pxm0: 0-480000 (4718592), absent 553990
> > [ 0.000000] pxm1: 880000-c80000 (4194304), absent 0
> > [ 0.000000] pxm2: 480000-880000 (4194304), absent 4194304
> > [ 0.000000] pxm3: c80000-1080000 (4194304), absent 0
> > [ 0.000000] SRAT: PXMs only cover 49035MB of your 65419MB e820 RAM. Not used.
> > [ 0.000000] SRAT: SRAT not used.
> >
>
> oh, i post one patch last week,
>
> can you check it?

Sure, let me try it. I already found out that commit 8716273c is the
guilty one (x86: Export srat physical topology).

--
Jens Axboe
* Re: kexec boot regression
2009-12-15 21:47 ` Jens Axboe
@ 2009-12-15 21:50 ` Yinghai Lu
2009-12-15 21:52 ` Jens Axboe
1 sibling, 0 replies; 42+ messages in thread
From: Yinghai Lu @ 2009-12-15 21:50 UTC (permalink / raw)
To: Jens Axboe
Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org, H. Peter Anvin, Huang Ying, rientjes

Jens Axboe wrote:
> On Tue, Dec 15 2009, Yinghai Lu wrote:
>> Jens Axboe wrote:
>>> On Tue, Dec 15 2009, Jens Axboe wrote:
>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>>> Jens Axboe wrote:
>>>>>> On Tue, Dec 15 2009, Jens Axboe wrote:
>>>>>>> On Tue, Dec 15 2009, Jens Axboe wrote:
>>>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>>>>>>> Jens Axboe wrote:
>>>>>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>>>>>>>>> Jens Axboe wrote:
>>>>>>>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>>>>>>>>>>> Jens Axboe wrote:
>>>>>>>>>>>>>> On Tue, Dec 15 2009, Yinghai Lu wrote:
>>>>>>>>>>>>>>> [PATCH] x86/pci: intel ioh bus num reg accessing fix
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> it is above 0x100, so if mmconf is not enable, need to skip it
>>>>>>>>>>>>>> This works, it kexecs kernels fine. But since 2.6.32 doesn't have the
>>>>>>>>>>>>>> mmconf problem to begin with, are we now just working around the issue?
>>>>>>>>>>>>>> SRAT still reports issues, numa doesn't work.
>>>>>>>>>>>>> that patch will be bullet proof... we need it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> also still need to figure out why memmap range is not passed properly.
>>>>>>>>>>>>>
>>>>>>>>>>>>> do you mean 2.6.32 kexec 2.6.32 it have worked mmconf and numa in
>>>>>>>>>>>>> second kernel?
>>>>>>>>>>>> Yes, 2.6.32 booted and 2.6.32 kexec'ed works just fine, no SRAT
>>>>>>>>>>>> complaints and NUMA works fine.
>>>>>>>>>>> do you need
>>>>>>>>>>> memmap=62G@4G
>>>>>>>>>>> in this case?
>>>>>>>>>> Yes, I've needed that always.
>>>>>>>>> good,
>>>>>>>>>
>>>>>>>>> can you enable debug option in kexec to see why kexec can not pass
>>>>>>>>> whole 38? range to second kernel?
>>>>>>>> Not getting any output so far, -d doesn't do much. Poking around in the
>>>>>>>> source...
>>>>>>> OK, cold boot and kexec 2.0.1 gets all 39 ranges passed properly to
>>>>>>> kexec'ed kernels. Since the older kexec stopped at range 30 (31 ranges
>>>>>>> total), that smells like just a kexec bug. Retesting -git...
>>>>>> Current -git works fine when all the ranges are passed correctly. So, I
>>>>>> think, the only existing regression is the SRAT issue.
>>>>> did you change node_shift?
>>>> Yes:
>>>>
>>>> CONFIG_NODES_SHIFT=6
>>>>
>>>> What I don't get is that 2.6.32 and -git print the same PXM map, and in
>>>> both cases it's totalling exactly 64G. Yet it says:
>>>>
>>>> SRAT: PXMs only cover 49035MB of your 65419MB e820 RAM. Not used.
>>> Clue:
>>>
>>> [ 0.000000] SRAT: Node 0 PXM 0 0-80000000
>>> [ 0.000000] SRAT: Node 0 PXM 0 100000000-480000000
>>> [ 0.000000] SRAT: Node 2 PXM 1 480000000-880000000
>>> [ 0.000000] SRAT: Node 1 PXM 2 880000000-c80000000
>>> [ 0.000000] SRAT: Node 3 PXM 3 c80000000-1080000000
>>> [ 0.000000] NUMA: Using 31 for the hash shift.
>>> [ 0.000000] pxm0: 0-480000 (4718592), absent 553990
>>> [ 0.000000] pxm1: 880000-c80000 (4194304), absent 0
>>> [ 0.000000] pxm2: 480000-880000 (4194304), absent 4194304
>>> [ 0.000000] pxm3: c80000-1080000 (4194304), absent 0
>>> [ 0.000000] SRAT: PXMs only cover 49035MB of your 65419MB e820 RAM. Not used.
>>> [ 0.000000] SRAT: SRAT not used.
>>>
>> oh, i post one patch last week,
>>
>> can you check it?
>
> Sure, let me try it. I already found out that commit 8716273c is the
> guilty one (x86: Export srat physical topology).

ok, my patch should fix that.

YH
* Re: kexec boot regression
2009-12-15 21:47 ` Jens Axboe
2009-12-15 21:50 ` Yinghai Lu
@ 2009-12-15 21:52 ` Jens Axboe
2009-12-15 22:24 ` Yinghai Lu
1 sibling, 1 reply; 42+ messages in thread
From: Jens Axboe @ 2009-12-15 21:52 UTC (permalink / raw)
To: Yinghai Lu
Cc: Jesse Barnes, Linux Kernel, mingo, rdreier, Suresh Siddha, linux-pci@vger.kernel.org, H. Peter Anvin, Huang Ying, rientjes

On Tue, Dec 15 2009, Jens Axboe wrote:
> > oh, i post one patch last week,
> >
> > can you check it?
>
> Sure, let me try it. I already found out that commit 8716273c is the
> guilty one (x86: Export srat physical topology).

Confirmed, -git with that patch works as well. So that's all of them I
think, can we please get this expedited in so that -rc1 will work?
Thanks!

-- 
Jens Axboe
* Re: kexec boot regression
2009-12-15 21:52 ` Jens Axboe
@ 2009-12-15 22:24 ` Yinghai Lu
2009-12-16 10:01 ` Jens Axboe
0 siblings, 1 reply; 42+ messages in thread
From: Yinghai Lu @ 2009-12-15 22:24 UTC (permalink / raw)
To: mingo, H. Peter Anvin, Thomas Gleixner
Cc: Jens Axboe, Jesse Barnes, Linux Kernel, rdreier, Suresh Siddha, linux-pci@vger.kernel.org, Huang Ying, rientjes

Jens Axboe wrote:
> On Tue, Dec 15 2009, Jens Axboe wrote:
>>> oh, i post one patch last week,
>>>
>>> can you check it?
>> Sure, let me try it. I already found out that commit 8716273c is the
>> guilty one (x86: Export srat physical topology).
>
> Confirmed, -git with that patch works as well. So that's all of them I
> think, can we please get this expedited in so that -rc1 will work?
> Thanks!

updated version:

[PATCH] x86: fix checking of SRAT when node0 ram is not from 0 -v3

Found one system that boots from socket1 instead of socket0; SRAT gets rejected...

[ 0.000000] SRAT: Node 1 PXM 0 0-a0000
[ 0.000000] SRAT: Node 1 PXM 0 100000-80000000
[ 0.000000] SRAT: Node 1 PXM 0 100000000-2080000000
[ 0.000000] SRAT: Node 0 PXM 1 2080000000-4080000000
[ 0.000000] SRAT: Node 2 PXM 2 4080000000-6080000000
[ 0.000000] SRAT: Node 3 PXM 3 6080000000-8080000000
[ 0.000000] SRAT: Node 4 PXM 4 8080000000-a080000000
[ 0.000000] SRAT: Node 5 PXM 5 a080000000-c080000000
[ 0.000000] SRAT: Node 6 PXM 6 c080000000-e080000000
[ 0.000000] SRAT: Node 7 PXM 7 e080000000-10080000000
...
[ 0.000000] NUMA: Allocated memnodemap from 500000 - 701040
[ 0.000000] NUMA: Using 20 for the hash shift.
[ 0.000000] Adding active range (0, 0x2080000, 0x4080000) 0 entries of 3200 used
[ 0.000000] Adding active range (1, 0x0, 0x96) 1 entries of 3200 used
[ 0.000000] Adding active range (1, 0x100, 0x7f750) 2 entries of 3200 used
[ 0.000000] Adding active range (1, 0x100000, 0x2080000) 3 entries of 3200 used
[ 0.000000] Adding active range (2, 0x4080000, 0x6080000) 4 entries of 3200 used
[ 0.000000] Adding active range (3, 0x6080000, 0x8080000) 5 entries of 3200 used
[ 0.000000] Adding active range (4, 0x8080000, 0xa080000) 6 entries of 3200 used
[ 0.000000] Adding active range (5, 0xa080000, 0xc080000) 7 entries of 3200 used
[ 0.000000] Adding active range (6, 0xc080000, 0xe080000) 8 entries of 3200 used
[ 0.000000] Adding active range (7, 0xe080000, 0x10080000) 9 entries of 3200 used
[ 0.000000] SRAT: PXMs only cover 917504MB of your 1048566MB e820 RAM. Not used.
[ 0.000000] SRAT: SRAT not used.

The early_node_map is not sorted, because node0 with a non-zero start
comes first.

So sort it right away, after all regions are registered.

This also fixes the regression caused by
	8716273c (x86: Export srat physical topology)

-v2: make it more solid to handle the cross-node case like
	node0 [0,4g), [8,12g) and node1 [4g, 8g), [12g, 16g)
-v3: update comments.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Jens Axboe <jens.axboe@oracle.com>

---
 arch/x86/mm/srat_32.c | 2 ++
 arch/x86/mm/srat_64.c | 4 +++-
 include/linux/mm.h | 3 +++
 mm/page_alloc.c | 4 ++--
 4 files changed, 10 insertions(+), 3 deletions(-)

Index: linux-2.6/arch/x86/mm/srat_32.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/srat_32.c
+++ linux-2.6/arch/x86/mm/srat_32.c
@@ -267,6 +267,8 @@ int __init get_memcfg_from_srat(void)
 		e820_register_active_regions(chunk->nid, chunk->start_pfn,
 					     min(chunk->end_pfn, max_pfn));
 	}
+	/* for out of order entries in SRAT */
+	sort_node_map();
 
 	for_each_online_node(nid) {
 		unsigned long start = node_start_pfn[nid];
Index: linux-2.6/arch/x86/mm/srat_64.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/srat_64.c
+++ linux-2.6/arch/x86/mm/srat_64.c
@@ -317,7 +317,7 @@ static int __init nodes_cover_memory(con
 		unsigned long s = nodes[i].start >> PAGE_SHIFT;
 		unsigned long e = nodes[i].end >> PAGE_SHIFT;
 		pxmram += e - s;
-		pxmram -= absent_pages_in_range(s, e);
+		pxmram -= __absent_pages_in_range(i, s, e);
 		if ((long)pxmram < 0)
 			pxmram = 0;
 	}
@@ -373,6 +373,8 @@ int __init acpi_scan_nodes(unsigned long
 	for_each_node_mask(i, nodes_parsed)
 		e820_register_active_regions(i, nodes[i].start >> PAGE_SHIFT,
 						nodes[i].end >> PAGE_SHIFT);
+	/* for out of order entries in SRAT */
+	sort_node_map();
 	if (!nodes_cover_memory(nodes)) {
 		bad_srat();
 		return -1;
Index: linux-2.6/include/linux/mm.h
===================================================================
--- linux-2.6.orig/include/linux/mm.h
+++ linux-2.6/include/linux/mm.h
@@ -1037,6 +1037,9 @@ extern void add_active_range(unsigned in
 extern void remove_active_range(unsigned int nid, unsigned long start_pfn,
 					unsigned long end_pfn);
 extern void remove_all_active_ranges(void);
+void sort_node_map(void);
+unsigned long __absent_pages_in_range(int nid, unsigned long start_pfn,
+						unsigned long end_pfn);
 extern unsigned long absent_pages_in_range(unsigned long start_pfn,
 						unsigned long end_pfn);
 extern void get_pfn_range_for_nid(unsigned int nid,
Index: linux-2.6/mm/page_alloc.c
===================================================================
--- linux-2.6.orig/mm/page_alloc.c
+++ linux-2.6/mm/page_alloc.c
@@ -3569,7 +3569,7 @@ static unsigned long __meminit zone_span
  * Return the number of holes in a range on a node. If nid is MAX_NUMNODES,
  * then all holes in the requested range will be accounted for.
  */
-static unsigned long __meminit __absent_pages_in_range(int nid,
+unsigned long __meminit __absent_pages_in_range(int nid,
 				unsigned long range_start_pfn,
 				unsigned long range_end_pfn)
 {
@@ -4098,7 +4098,7 @@ static int __init cmp_node_active_region
 }
 
 /* sort the node_map by start_pfn */
-static void __init sort_node_map(void)
+void __init sort_node_map(void)
 {
 	sort(early_node_map, (size_t)nr_nodemap_entries,
 			sizeof(struct node_active_region),
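The ordering problem described in the patch can be reproduced outside the kernel. A minimal sketch, assuming a hole counter that walks the map front to back in a single pass (a simplified model, not the kernel's `__absent_pages_in_range()`). `struct region`, `cmp_region()`, `holes_in_range()` and `logged_order[]` are names invented here, the standard `qsort()` stands in for the kernel's `sort()`, and node ids are ignored when counting:

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for one early_node_map entry (not the kernel struct). */
struct region {
	int nid;
	unsigned long start_pfn;
	unsigned long end_pfn;
};

/* First five "Adding active range" entries from the log, in logged order. */
static const struct region logged_order[] = {
	{ 0, 0x2080000, 0x4080000 },
	{ 1, 0x0,       0x96 },
	{ 1, 0x100,     0x7f750 },
	{ 1, 0x100000,  0x2080000 },
	{ 2, 0x4080000, 0x6080000 },
};

/* Order regions by ascending start_pfn, for use with qsort(). */
static int cmp_region(const void *a, const void *b)
{
	const struct region *ra = a;
	const struct region *rb = b;

	if (ra->start_pfn < rb->start_pfn)
		return -1;
	return ra->start_pfn > rb->start_pfn;
}

/*
 * Count the pages in [lo, hi) that no region covers, in one forward pass.
 * The early break is only valid when map[] is sorted by start_pfn.
 */
static unsigned long holes_in_range(const struct region *map, size_t n,
				    unsigned long lo, unsigned long hi)
{
	unsigned long cursor = lo, holes = 0;
	size_t i;

	for (i = 0; i < n && cursor < hi; i++) {
		if (map[i].end_pfn <= cursor)
			continue;	/* region lies behind the cursor */
		if (map[i].start_pfn >= hi)
			break;		/* sorted input: nothing further overlaps */
		if (map[i].start_pfn > cursor)
			holes += map[i].start_pfn - cursor;
		cursor = map[i].end_pfn < hi ? map[i].end_pfn : hi;
	}
	if (cursor < hi)
		holes += hi - cursor;
	return holes;
}

/* Build with -DSORT_SKETCH_MAIN to get a standalone program. */
#ifdef SORT_SKETCH_MAIN
int main(void)
{
	struct region map[5];
	size_t i;

	for (i = 0; i < 5; i++)
		map[i] = logged_order[i];
	printf("unsorted: 0x%lx holes\n", holes_in_range(map, 5, 0x0, 0x2080000));
	qsort(map, 5, sizeof(map[0]), cmp_region);
	printf("sorted:   0x%lx holes\n", holes_in_range(map, 5, 0x0, 0x2080000));
	return 0;
}
#endif
```

In the logged order the walker meets the node 0 entry first and reports all 0x2080000 pages of [0, 0x2080000) as holes. After sorting by start_pfn it reports only the two real gaps, 0x96-0x100 and 0x7f750-0x100000.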
* Re: kexec boot regression
2009-12-15 22:24 ` Yinghai Lu
@ 2009-12-16 10:01 ` Jens Axboe
0 siblings, 0 replies; 42+ messages in thread
From: Jens Axboe @ 2009-12-16 10:01 UTC (permalink / raw)
To: Yinghai Lu
Cc: mingo, H. Peter Anvin, Thomas Gleixner, Jesse Barnes, Linux Kernel, rdreier, Suresh Siddha, linux-pci@vger.kernel.org, Huang Ying, rientjes

On Tue, Dec 15 2009, Yinghai Lu wrote:
> Jens Axboe wrote:
> > On Tue, Dec 15 2009, Jens Axboe wrote:
> >>> oh, i post one patch last week,
> >>>
> >>> can you check it?
> >> Sure, let me try it. I already found out that commit 8716273c is the
> >> guilty one (x86: Export srat physical topology).
> >
> > Confirmed, -git with that patch works as well. So that's all of them I
> > think, can we please get this expedited in so that -rc1 will work?
> > Thanks!
>
> updated version:
>
> [PATCH] x86: fix checking of SRAT when node0 ram is not from 0 -v3

Verified, this one works fine, too.

-- 
Jens Axboe
end of thread, other threads:[~2009-12-16 10:01 UTC | newest]

Thread overview: 42+ messages
2009-12-15 11:50 kexec boot regression Jens Axboe
2009-12-15 12:01 ` Yinghai Lu
2009-12-15 12:14 ` Jens Axboe
2009-12-15 12:31 ` Yinghai Lu
2009-12-15 12:39 ` Jens Axboe
2009-12-15 12:55 ` Yinghai Lu
2009-12-15 14:11 ` Jens Axboe
2009-12-15 18:39 ` Yinghai Lu
2009-12-15 18:47 ` Matthew Wilcox
2009-12-15 18:54 ` Jens Axboe
2009-12-15 18:59 ` Jens Axboe
2009-12-15 19:04 ` Yinghai Lu
2009-12-15 19:11 ` Jens Axboe
2009-12-15 19:17 ` Yinghai Lu
2009-12-15 19:22 ` Jens Axboe
2009-12-15 19:28 ` Jens Axboe
2009-12-15 19:44 ` Yinghai Lu
2009-12-15 19:48 ` Jens Axboe
2009-12-15 19:49 ` Yinghai Lu
2009-12-15 19:57 ` Jens Axboe
2009-12-15 21:30 ` Markus Trippelsdorf
2009-12-15 23:02 ` kexec boot regression radeon/kms (bisected) Markus Trippelsdorf
2009-12-15 19:43 ` kexec boot regression Jens Axboe
2009-12-15 19:48 ` Yinghai Lu
2009-12-15 19:51 ` Jens Axboe
2009-12-15 19:56 ` Yinghai Lu
2009-12-15 20:09 ` Jens Axboe
2009-12-15 20:14 ` Yinghai Lu
2009-12-15 20:19 ` Jens Axboe
2009-12-15 20:21 ` Yinghai Lu
2009-12-15 20:42 ` Jens Axboe
2009-12-15 20:55 ` Jens Axboe
2009-12-15 21:01 ` Jens Axboe
2009-12-15 21:26 ` Yinghai Lu
2009-12-15 21:30 ` Jens Axboe
2009-12-15 21:40 ` Jens Axboe
2009-12-15 21:43 ` Yinghai Lu
2009-12-15 21:47 ` Jens Axboe
2009-12-15 21:50 ` Yinghai Lu
2009-12-15 21:52 ` Jens Axboe
2009-12-15 22:24 ` Yinghai Lu
2009-12-16 10:01 ` Jens Axboe