* kdump: quad core Opteron
@ 2008-10-07 12:51 Chandru
2008-10-07 13:06 ` Bernhard Walle
2008-10-07 13:24 ` Vivek Goyal
From: Chandru @ 2008-10-07 12:51 UTC (permalink / raw)
To: kexec; +Cc: Vivek Goyal
kdump on a quad core Opteron blade machine doesn't give a complete
vmcore on the system. All works well until we attempt to copy
/proc/vmcore to some target place (disk, network). The system immediately
resets without any OS messages after having copied a few MBs of the vmcore
file. The problem also occurs with 2.6.27-rc8 and the latest kexec-tools. If
we pass 'mem=4G' as a boot parameter to the first kernel, then kdump
succeeds in copying a readable vmcore to /var/crash.
[root@abc ~]# uname -a
Linux abc 2.6.27-rc8 #2 SMP Tue Oct 7 08:05:46 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
Chandru
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
* Re: kdump: quad core Opteron
2008-10-07 12:51 kdump: quad core Opteron Chandru
@ 2008-10-07 13:06 ` Bernhard Walle
2008-10-07 13:24 ` Vivek Goyal
From: Bernhard Walle @ 2008-10-07 13:06 UTC (permalink / raw)
To: kexec
* Chandru [2008-10-07 18:21]:
>
> kdump on a quad core Opteron blade machine doesn't give a complete
> vmcore on the system. All works well until we attempt to copy
> /proc/vmcore to some target place ( disk , n/w ). The system immediately
> resets without any OS messages after having copied few mb's of vmcore
> file. Problem also occurs with 2.6.27-rc8 and latest kexec-tools. If
> we pass 'mem=4G' as boot parameter to the first kernel, then kdump
> succeeds in copying a readable vmcore to /var/crash.
Maybe extend the copy process to print out the address, and then look
at which address it crashes. Then look in the memory map, maybe there's
something special there.
Regards,
Bernhard
--
Bernhard Walle, SUSE Linux Products GmbH, Architecture Development
* Re: kdump: quad core Opteron
2008-10-07 12:51 kdump: quad core Opteron Chandru
2008-10-07 13:06 ` Bernhard Walle
@ 2008-10-07 13:24 ` Vivek Goyal
2008-10-07 15:59 ` Bob Montgomery
2008-10-08 13:40 ` Chandru
From: Vivek Goyal @ 2008-10-07 13:24 UTC (permalink / raw)
To: Chandru; +Cc: kexec
On Tue, Oct 07, 2008 at 06:21:52PM +0530, Chandru wrote:
> kdump on a quad core Opteron blade machine doesn't give a complete
> vmcore on the system. All works well until we attempt to copy
> /proc/vmcore to some target place ( disk , n/w ). The system immediately
> resets without any OS messages after having copied few mb's of vmcore
> file. Problem also occurs with 2.6.27-rc8 and latest kexec-tools. If
> we pass 'mem=4G' as boot parameter to the first kernel, then kdump
> succeeds in copying a readable vmcore to /var/crash.
>
Hi Chandru,
How much memory does this system have? Can you also paste the output of
/proc/iomem from the first kernel.
Does this system have a GART? It looks like we are accessing some memory
area which the platform does not like. (We saw issues with GART in the past.)
Can you also provide the /proc/vmcore ELF header (readelf output), in both
cases (with mem=4G and without it).
You can try putting some printks in the /proc/vmcore code and see which
physical memory area you are accessing when the system goes bust. If in all
the failure cases it is the same physical memory area, then we can try to find
what's so special about it.
Thanks
Vivek
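The instrumentation Vivek suggests might look like the hypothetical debug patch below, in the thread's own patch style. The file and function names are from contemporary kernels, but the hunk context is illustrative, not an exact diff:

```
--- a/fs/proc/vmcore.c
+++ b/fs/proc/vmcore.c
@@ read_vmcore(), inside the loop over memory chunks (context illustrative)
 		} else {
+			/* debug: which physical area is about to be read? */
+			printk(KERN_INFO "vmcore: paddr 0x%llx len %zu\n",
+			       (unsigned long long)start, tsz);
 			tmp = read_from_oldmem(buffer, tsz, &start, 1);
```

Watching the console during the copy would then show the last physical address attempted before the reset.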
* Re: kdump: quad core Opteron
2008-10-07 13:24 ` Vivek Goyal
@ 2008-10-07 15:59 ` Bob Montgomery
2008-10-08 13:51 ` Chandru
2008-12-08 15:56 ` Chandru
2008-10-08 13:40 ` Chandru
From: Bob Montgomery @ 2008-10-07 15:59 UTC (permalink / raw)
To: Vivek Goyal; +Cc: Chandru, kexec@lists.infradead.org
On Tue, 2008-10-07 at 13:24 +0000, Vivek Goyal wrote:
> On Tue, Oct 07, 2008 at 06:21:52PM +0530, Chandru wrote:
> > kdump on a quad core Opteron blade machine doesn't give a complete
> > vmcore on the system. All works well until we attempt to copy
> > /proc/vmcore to some target place ( disk , n/w ). The system immediately
> > resets without any OS messages after having copied few mb's of vmcore
> > file. Problem also occurs with 2.6.27-rc8 and latest kexec-tools. If
> > we pass 'mem=4G' as boot parameter to the first kernel, then kdump
> > succeeds in copying a readable vmcore to /var/crash.
> >
>
> Hi Chandru,
>
> How much memory this system has got. Can you also paste the output of
> /proc/iomem of first kernel.
>
> Does this system has GART? So looks like we are accessing some memory area
> which platform does not like. (We saw issues with GART in the past.)
>
> Can you also provide /proc/vmcore ELF header (readelf output), in both
> the cases (mem=4G and without that).
>
> You can try putting some printk in /proc/vmcore code and see which
> physical memory area you are accessing when system goes bust. If in all
> the failure cases it is same physical memory area, then we can try to find
> what's so special about it.
Or you can assume this is pretty much exactly the problem I ran into in
August. I've attached the patch that I'm using with our 2.6.18 kernel
to disable CPU-side access by the GART, which prevents the problem on
our Family 10H systems. You'll need to fix the directory name for
kernels newer than the arch/x86_64 merge.
Now that someone else has seen the problem, if this fixes it, I'll
submit the patch upstream.
Here's the README for the patch:
This patch changes the initialization of the GART (in
pci-gart.c:init_k8_gatt) to set the DisGartCpu bit in the GART Aperture
Control Register. Setting the bit prevents requests from the CPUs from
accessing the GART. In other words, CPU memory accesses within the
range of addresses in the aperture will not cause the GART to perform an
address translation. The aperture area was already being unmapped at
the kernel level with clear_kernel_mapping() to prevent accesses from
the CPU, but that kernel level unmapping is not in effect in the kexec'd
kdump kernel. By disabling the CPU-side accesses within the GART, which
does persist through the kexec of the kdump kernel, the kdump kernel is
prevented from interacting with the GART during accesses to the dump
memory areas which include the address range of the GART aperture.
Although the patch can be applied to the kdump kernel, it is not
exercised there because the kdump kernel doesn't attempt to initialize
the GART.
Bob Montgomery
working at HP
[-- Attachment #2: gart.cpuside.patch --]
[-- Type: text/x-patch, Size: 557 bytes --]
--- k-2.6.18-6-clim-amd64-bobm/arch/x86_64/kernel/pci-gart.c.orig 2008-08-13 08:53:33.000000000 -0600
+++ k-2.6.18-6-clim-amd64-bobm/arch/x86_64/kernel/pci-gart.c 2008-08-13 08:55:48.000000000 -0600
@@ -540,8 +540,9 @@ static __init int init_k8_gatt(struct ag
pci_write_config_dword(dev, 0x98, gatt_reg);
pci_read_config_dword(dev, 0x90, &ctl);
- ctl |= 1;
- ctl &= ~((1<<4) | (1<<5));
+ ctl |= 1; /* set GartEn */
+ ctl |= (1<<4); /* set DisGartCpu */
+ ctl &= ~(1<<5); /* clear DisGartIO */
pci_write_config_dword(dev, 0x90, ctl);
}
* Re: kdump: quad core Opteron
2008-10-07 13:24 ` Vivek Goyal
2008-10-07 15:59 ` Bob Montgomery
@ 2008-10-08 13:40 ` Chandru
From: Chandru @ 2008-10-08 13:40 UTC (permalink / raw)
To: Vivek Goyal; +Cc: kexec
Vivek Goyal wrote:
> Hi Chandru,
>
> How much memory this system has got. Can you also paste the output of
> /proc/iomem of first kernel.
>
> Does this system has GART? So looks like we are accessing some memory area
> which platform does not like. (We saw issues with GART in the past.)
>
The system has 8GB of RAM. /proc/iomem shows the following without the
mem=4G boot parameter:
[root@abc]# cat /proc/iomem
00000000-0009afff : System RAM
0009b000-0009ffff : reserved
000e0000-000fffff : reserved
00100000-cff9f6ff : System RAM
00200000-0048adc9 : Kernel code
0048adca-005ee18f : Kernel data
0076d000-00823a4b : Kernel bss
02000000-11ffffff : Crash kernel
20000000-23ffffff : GART
cff9f700-cffa6fff : ACPI Tables
cffa7000-cfffffff : reserved
d4000000-d41fffff : PCI Bus 0000:01
d4000000-d41fffff : PCI Bus 0000:02
d4000000-d41fffff : PCI Bus 0000:03
d4000000-d41fffff : 0000:03:00.0
d4200000-d421ffff : 0000:00:05.0
d6000000-d6ffffff : PCI Bus 0000:0a
d7000000-d7ffffff : PCI Bus 0000:09
d8000000-d8ffffff : PCI Bus 0000:08
d9000000-d9ffffff : PCI Bus 0000:07
db000000-dcffffff : PCI Bus 0000:05
dc000000-dcffffff : PCI Bus 0000:06
de000000-e70fffff : PCI Bus 0000:01
de000000-e50fffff : PCI Bus 0000:02
de000000-e2ffffff : PCI Bus 0000:04
de000000-dfffffff : 0000:04:00.0
de000000-dfffffff : bnx2
e0000000-e1ffffff : 0000:04:00.1
e0000000-e1ffffff : bnx2
e4000000-e50fffff : PCI Bus 0000:03
e5000000-e500ffff : 0000:03:00.0
e5000000-e500ffff : mpt
e5010000-e5013fff : 0000:03:00.0
e5010000-e5013fff : mpt
e7000000-e701ffff : 0000:01:00.0
e8000000-efffffff : 0000:00:05.0
f3fed000-f3fedfff : 0000:00:0f.2
f3fed000-f3fedfff : ehci_hcd
f3fee000-f3feefff : 0000:00:0f.1
f3fee000-f3feefff : ohci_hcd
f3fef000-f3feffff : 0000:00:0f.0
f3fef000-f3feffff : ohci_hcd
f3ff0000-f3ffffff : 0000:00:05.0
f4000000-fbffffff : reserved
fa000000-faafffff : PCI MMCONFIG 0
fec00000-ffffffff : reserved
fec00000-fec00fff : IOAPIC 0
fec02000-fec02fff : IOAPIC 1
fed00000-fed003ff : HPET 0
fee00000-fee00fff : Local APIC
100000000-22fffffff : System RAM
With mem=4G, /proc/iomem is as follows. The GART memory range is
missing here:
00000000-0009afff : System RAM
0009b000-0009ffff : reserved
000e0000-000fffff : reserved
00100000-cff9f6ff : System RAM
00200000-0048adc9 : Kernel code
0048adca-005ee18f : Kernel data
0076d000-00823a4b : Kernel bss
02000000-11ffffff : Crash kernel
cff9f700-cffa6fff : ACPI Tables
cffa7000-cfffffff : reserved
d4000000-d41fffff : PCI Bus 0000:01
d4000000-d41fffff : PCI Bus 0000:02
d4000000-d41fffff : PCI Bus 0000:03
d4000000-d41fffff : 0000:03:00.0
d4200000-d421ffff : 0000:00:05.0
d6000000-d6ffffff : PCI Bus 0000:0a
d7000000-d7ffffff : PCI Bus 0000:09
d8000000-d8ffffff : PCI Bus 0000:08
d9000000-d9ffffff : PCI Bus 0000:07
db000000-dcffffff : PCI Bus 0000:05
dc000000-dcffffff : PCI Bus 0000:06
de000000-e70fffff : PCI Bus 0000:01
de000000-e50fffff : PCI Bus 0000:02
de000000-e2ffffff : PCI Bus 0000:04
de000000-dfffffff : 0000:04:00.0
de000000-dfffffff : bnx2
e0000000-e1ffffff : 0000:04:00.1
e0000000-e1ffffff : bnx2
e4000000-e50fffff : PCI Bus 0000:03
e5000000-e500ffff : 0000:03:00.0
e5000000-e500ffff : mpt
e5010000-e5013fff : 0000:03:00.0
e5010000-e5013fff : mpt
e7000000-e701ffff : 0000:01:00.0
e8000000-efffffff : 0000:00:05.0
f3fed000-f3fedfff : 0000:00:0f.2
f3fed000-f3fedfff : ehci_hcd
f3fee000-f3feefff : 0000:00:0f.1
f3fee000-f3feefff : ohci_hcd
f3fef000-f3feffff : 0000:00:0f.0
f3fef000-f3feffff : ohci_hcd
f3ff0000-f3ffffff : 0000:00:05.0
f4000000-fbffffff : reserved
fa000000-faafffff : PCI MMCONFIG 0
fec00000-ffffffff : reserved
fec00000-fec00fff : IOAPIC 0
fec02000-fec02fff : IOAPIC 1
fed00000-fed003ff : HPET 0
fee00000-fee00fff : Local APIC
> Can you also provide /proc/vmcore ELF header (readelf output), in both
> the cases (mem=4G and without that).
>
ELF header with mem=4G
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: CORE (Core file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x0
Start of program headers: 64 (bytes into file)
Start of section headers: 0 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 5
Size of section headers: 0 (bytes)
Number of section headers: 0
Section header string table index: 0
There are no sections in this file.
There are no sections in this file.
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
NOTE 0x0000000000000158 0x0000000000000000 0x0000000000000000
0x0000000000000b20 0x0000000000000b20 0
LOAD 0x0000000000000c78 0xffffffff80200000 0x0000000000200000
0x0000000000624000 0x0000000000624000 RWE 0
LOAD 0x0000000000624c78 0xffff810000000000 0x0000000000000000
0x00000000000a0000 0x00000000000a0000 RWE 0
LOAD 0x00000000006c4c78 0xffff810000100000 0x0000000000100000
0x0000000001f00000 0x0000000001f00000 RWE 0
LOAD 0x00000000025c4c78 0xffff810012000000 0x0000000012000000
0x00000000bdf9f700 0x00000000bdf9f700 RWE 0
There is no dynamic section in this file.
There are no relocations in this file.
There are no unwind sections in this file.
No version information found in this file.
Notes at offset 0x00000158 with length 0x00000b20:
Owner Data size Description
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
------------------------------------------------------------------------------------
ELF header without mem=4G
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: CORE (Core file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x0
Start of program headers: 64 (bytes into file)
Start of section headers: 0 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 6
Size of section headers: 0 (bytes)
Number of section headers: 0
Section header string table index: 0
There are no sections in this file.
There are no sections in this file.
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
NOTE 0x0000000000000190 0x0000000000000000 0x0000000000000000
0x0000000000000b20 0x0000000000000b20 0
LOAD 0x0000000000000cb0 0xffffffff80200000 0x0000000000200000
0x0000000000624000 0x0000000000624000 RWE 0
LOAD 0x0000000000624cb0 0xffff810000000000 0x0000000000000000
0x00000000000a0000 0x00000000000a0000 RWE 0
LOAD 0x00000000006c4cb0 0xffff810000100000 0x0000000000100000
0x0000000001f00000 0x0000000001f00000 RWE 0
LOAD 0x00000000025c4cb0 0xffff810012000000 0x0000000012000000
0x00000000bdf9f700 0x00000000bdf9f700 RWE 0
LOAD 0x00000000c05643b0 0xffff810100000000 0x0000000100000000
0x0000000130000000 0x0000000130000000 RWE 0
There is no dynamic section in this file.
There are no relocations in this file.
There are no unwind sections in this file.
No version information found in this file.
Notes at offset 0x00000190 with length 0x00000b20:
Owner Data size Description
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
CORE 0x00000150 NT_PRSTATUS (prstatus structure)
> You can try putting some printk in /proc/vmcore code and see which
> physical memory area you are accessing when system goes bust. If in all
> the failure cases it is same physical memory area, then we can try to find
> what's so special about it.
> Thanks
> Vivek
>
The vmcore-incomplete files are of different sizes on different runs
(18M, 32M, ...), and in the case of a network copy we get 190M, 198M.
I tried the patch provided by Bob Montgomery and it seems to be
working on this machine.
Thanks,
Chandru
* Re: kdump: quad core Opteron
2008-10-07 15:59 ` Bob Montgomery
@ 2008-10-08 13:51 ` Chandru
2008-12-08 15:56 ` Chandru
From: Chandru @ 2008-10-08 13:51 UTC (permalink / raw)
To: bob.montgomery; +Cc: kexec@lists.infradead.org, Vivek Goyal
Bob Montgomery wrote:
> Here's the README for the patch:
>
> This patch changes the initialization of the GART (in
> pci-gart.c:init_k8_gatt) to set the DisGartCpu bit in the GART Aperture
> Control Register. Setting the bit Disables requests from the CPUs from
> accessing the GART. In other words, CPU memory accesses within the
> range of addresses in the aperture will not cause the GART to perform an
> address translation. The aperture area was already being unmapped at
> the kernel level with clear_kernel_mapping() to prevent accesses from
> the CPU, but that kernel level unmapping is not in effect in the kexec'd
> kdump kernel. By disabling the CPU-side accesses within the GART, which
> does persist through the kexec of the kdump kernel, the kdump kernel is
> prevented from interacting with the GART during accesses to the dump
> memory areas which include the address range of the GART aperture.
> Although the patch can be applied to the kdump kernel, it is not
> exercised there because the kdump kernel doesn't attempt to initialize
> the GART.
>
> Bob Montgomery
> working at HP
>
>
I made the following changes and it seems to be working:
--- linux-2.6.27-rc8/include/asm-x86/gart.h.orig 2008-10-08 09:41:23.000000000 -0400
+++ linux-2.6.27-rc8/include/asm-x86/gart.h 2008-10-08 09:44:35.000000000 -0400
@@ -42,7 +42,8 @@ static inline void enable_gart_translati
/* Enable GART translation for this hammer. */
pci_read_config_dword(dev, AMD64_GARTAPERTURECTL, &ctl);
ctl |= GARTEN;
- ctl &= ~(DISGARTCPU | DISGARTIO);
+ ctl |= DISGARTCPU;
+ ctl &= ~(DISGARTIO);
pci_write_config_dword(dev, AMD64_GARTAPERTURECTL, ctl);
}
The system doesn't reset now while copying /proc/vmcore. Thanks!
Chandru
* Re: kdump: quad core Opteron
2008-10-07 15:59 ` Bob Montgomery
2008-10-08 13:51 ` Chandru
@ 2008-12-08 15:56 ` Chandru
2008-12-08 21:54 ` Vivek Goyal
2008-12-08 23:35 ` Bob Montgomery
From: Chandru @ 2008-12-08 15:56 UTC (permalink / raw)
To: kexec, bob.montgomery; +Cc: Vivek Goyal
On Tuesday 07 October 2008 21:29:51 Bob Montgomery wrote:
> On Tue, 2008-10-07 at 13:24 +0000, Vivek Goyal wrote:
> > On Tue, Oct 07, 2008 at 06:21:52PM +0530, Chandru wrote:
> > > kdump on a quad core Opteron blade machine doesn't give a complete
> > > vmcore on the system. All works well until we attempt to copy
> > > /proc/vmcore to some target place ( disk , n/w ). The system
> > > immediately resets without any OS messages after having copied few mb's
> > > of vmcore file. Problem also occurs with 2.6.27-rc8 and latest
> > > kexec-tools. If we pass 'mem=4G' as boot parameter to the first
> > > kernel, then kdump succeeds in copying a readable vmcore to /var/crash.
> >
> > Hi Chandru,
> >
> > How much memory this system has got. Can you also paste the output of
> > /proc/iomem of first kernel.
> >
> > Does this system has GART? So looks like we are accessing some memory
> > area which platform does not like. (We saw issues with GART in the past.)
> >
> > Can you also provide /proc/vmcore ELF header (readelf output), in both
> > the cases (mem=4G and without that).
> >
> > You can try putting some printk in /proc/vmcore code and see which
> > physical memory area you are accessing when system goes bust. If in all
> > the failure cases it is same physical memory area, then we can try to
> > find what's so special about it.
>
> Or you can assume this is pretty much exactly the problem I ran into in
> August. I've attached the patch that I'm using with our 2.6.18 kernel
> to disable CPU-side access by the GART, which prevents the problem on
> our Family 10H systems. You'll need to fix the directory name for
> kernels newer than the arch/x86_64 merge.
>
> Now that someone else has seen the problem, if this fixes it, I'll
> submit the patch upstream.
>
> Here's the README for the patch:
>
> This patch changes the initialization of the GART (in
> pci-gart.c:init_k8_gatt) to set the DisGartCpu bit in the GART Aperture
> Control Register. Setting the bit Disables requests from the CPUs from
> accessing the GART. In other words, CPU memory accesses within the
> range of addresses in the aperture will not cause the GART to perform an
> address translation. The aperture area was already being unmapped at
> the kernel level with clear_kernel_mapping() to prevent accesses from
> the CPU, but that kernel level unmapping is not in effect in the kexec'd
> kdump kernel. By disabling the CPU-side accesses within the GART, which
> does persist through the kexec of the kdump kernel, the kdump kernel is
> prevented from interacting with the GART during accesses to the dump
> memory areas which include the address range of the GART aperture.
> Although the patch can be applied to the kdump kernel, it is not
> exercised there because the kdump kernel doesn't attempt to initialize
> the GART.
>
> Bob Montgomery
> working at HP
Hi Bob,
This problem was recently reported on an LS42 blade, and the patch you
provided resolved the issue there too. However, I made a couple of changes to
kexec-tools to ignore the GART memory region and not have ELF headers created
for it. This patch also seemed to work on an LS21.
Thanks,
Chandru
Signed-off-by: Chandru S <chandru@in.ibm.com>
---
--- kexec-tools/kexec/arch/x86_64/crashdump-x86_64.c.orig 2008-12-08 01:50:41.000000000 -0600
+++ kexec-tools/kexec/arch/x86_64/crashdump-x86_64.c 2008-12-08 03:02:45.000000000 -0600
@@ -47,7 +47,7 @@ static struct crash_elf_info elf_info =
};
/* Forward Declaration. */
-static int exclude_crash_reserve_region(int *nr_ranges);
+static int exclude_region(int *nr_ranges, uint64_t start, uint64_t end);
#define KERN_VADDR_ALIGN 0x100000 /* 1MB */
@@ -164,10 +164,11 @@ static struct memory_range crash_reserve
static int get_crash_memory_ranges(struct memory_range **range, int *ranges)
{
const char *iomem= proc_iomem();
- int memory_ranges = 0;
+ int memory_ranges = 0, gart = 0;
char line[MAX_LINE];
FILE *fp;
unsigned long long start, end;
+ uint64_t gart_start = 0, gart_end = 0;
fp = fopen(iomem, "r");
if (!fp) {
@@ -219,6 +220,10 @@ static int get_crash_memory_ranges(struc
type = RANGE_ACPI;
} else if(memcmp(str,"ACPI Non-volatile Storage\n",26) == 0 ) {
type = RANGE_ACPI_NVS;
+ } else if (memcmp(str, "GART\n", 5) == 0) {
+ gart_start = start;
+ gart_end = end;
+ gart = 1;
} else {
continue;
}
@@ -233,8 +238,14 @@ static int get_crash_memory_ranges(struc
memory_ranges++;
}
fclose(fp);
- if (exclude_crash_reserve_region(&memory_ranges) < 0)
+ if (exclude_region(&memory_ranges, crash_reserved_mem.start,
+ crash_reserved_mem.end) < 0)
return -1;
+ if (gart) {
+ /* exclude GART region if the system has one */
+ if (exclude_region(&memory_ranges, gart_start, gart_end) < 0)
+ return -1;
+ }
*range = crash_memory_range;
*ranges = memory_ranges;
#ifdef DEBUG
@@ -252,32 +263,27 @@ static int get_crash_memory_ranges(struc
/* Removes crash reserve region from list of memory chunks for whom elf program
* headers have to be created. Assuming crash reserve region to be a single
* continuous area fully contained inside one of the memory chunks */
-static int exclude_crash_reserve_region(int *nr_ranges)
+static int exclude_region(int *nr_ranges, uint64_t start, uint64_t end)
{
int i, j, tidx = -1;
- unsigned long long cstart, cend;
struct memory_range temp_region;
- /* Crash reserved region. */
- cstart = crash_reserved_mem.start;
- cend = crash_reserved_mem.end;
-
for (i = 0; i < (*nr_ranges); i++) {
unsigned long long mstart, mend;
mstart = crash_memory_range[i].start;
mend = crash_memory_range[i].end;
- if (cstart < mend && cend > mstart) {
- if (cstart != mstart && cend != mend) {
+ if (start < mend && end > mstart) {
+ if (start != mstart && end != mend) {
/* Split memory region */
- crash_memory_range[i].end = cstart - 1;
- temp_region.start = cend + 1;
+ crash_memory_range[i].end = start - 1;
+ temp_region.start = end + 1;
temp_region.end = mend;
temp_region.type = RANGE_RAM;
tidx = i+1;
- } else if (cstart != mstart)
- crash_memory_range[i].end = cstart - 1;
+ } else if (start != mstart)
+ crash_memory_range[i].end = start - 1;
else
- crash_memory_range[i].start = cend + 1;
+ crash_memory_range[i].start = end + 1;
}
}
/* Insert split memory region, if any. */
* Re: kdump: quad core Opteron
2008-12-08 15:56 ` Chandru
@ 2008-12-08 21:54 ` Vivek Goyal
2008-12-09 12:12 ` Chandru
2008-12-08 23:35 ` Bob Montgomery
From: Vivek Goyal @ 2008-12-08 21:54 UTC (permalink / raw)
To: Chandru; +Cc: bob.montgomery, kexec
On Mon, Dec 08, 2008 at 09:26:16PM +0530, Chandru wrote:
> On Tuesday 07 October 2008 21:29:51 Bob Montgomery wrote:
> > On Tue, 2008-10-07 at 13:24 +0000, Vivek Goyal wrote:
> > > On Tue, Oct 07, 2008 at 06:21:52PM +0530, Chandru wrote:
> > > > kdump on a quad core Opteron blade machine doesn't give a complete
> > > > vmcore on the system. All works well until we attempt to copy
> > > > /proc/vmcore to some target place ( disk , n/w ). The system
> > > > immediately resets without any OS messages after having copied few mb's
> > > > of vmcore file. Problem also occurs with 2.6.27-rc8 and latest
> > > > kexec-tools. If we pass 'mem=4G' as boot parameter to the first
> > > > kernel, then kdump succeeds in copying a readable vmcore to /var/crash.
> > >
> > > Hi Chandru,
> > >
> > > How much memory this system has got. Can you also paste the output of
> > > /proc/iomem of first kernel.
> > >
> > > Does this system has GART? So looks like we are accessing some memory
> > > area which platform does not like. (We saw issues with GART in the past.)
> > >
> > > Can you also provide /proc/vmcore ELF header (readelf output), in both
> > > the cases (mem=4G and without that).
> > >
> > > You can try putting some printk in /proc/vmcore code and see which
> > > physical memory area you are accessing when system goes bust. If in all
> > > the failure cases it is same physical memory area, then we can try to
> > > find what's so special about it.
> >
> > Or you can assume this is pretty much exactly the problem I ran into in
> > August. I've attached the patch that I'm using with our 2.6.18 kernel
> > to disable CPU-side access by the GART, which prevents the problem on
> > our Family 10H systems. You'll need to fix the directory name for
> > kernels newer than the arch/x86_64 merge.
> >
> > Now that someone else has seen the problem, if this fixes it, I'll
> > submit the patch upstream.
> >
> > Here's the README for the patch:
> >
> > This patch changes the initialization of the GART (in
> > pci-gart.c:init_k8_gatt) to set the DisGartCpu bit in the GART Aperture
> > Control Register. Setting the bit Disables requests from the CPUs from
> > accessing the GART. In other words, CPU memory accesses within the
> > range of addresses in the aperture will not cause the GART to perform an
> > address translation. The aperture area was already being unmapped at
> > the kernel level with clear_kernel_mapping() to prevent accesses from
> > the CPU, but that kernel level unmapping is not in effect in the kexec'd
> > kdump kernel. By disabling the CPU-side accesses within the GART, which
> > does persist through the kexec of the kdump kernel, the kdump kernel is
> > prevented from interacting with the GART during accesses to the dump
> > memory areas which include the address range of the GART aperture.
> > Although the patch can be applied to the kdump kernel, it is not
> > exercised there because the kdump kernel doesn't attempt to initialize
> > the GART.
> >
> > Bob Montgomery
> > working at HP
>
> Hi Bob,
>
> This problem was recently reported on a LS42 blade and the patch given by you
> also resolved the issue here too. However I made couple of changes to
> kexec-tools to ignore GART memory region and not have elf headers created to
> it. This patch also seemed to work on a LS21.
>
> Thanks,
> Chandru
>
> Signed-off-by: Chandru S <chandru@in.ibm.com>
> ---
Hi Chandru,
So this patch will solve the issue (at least for /proc/vmcore) even if
we don't make any changes on the kernel side?
Thanks
Vivek
>
> --- kexec-tools/kexec/arch/x86_64/crashdump-x86_64.c.orig 2008-12-08 01:50:41.000000000 -0600
> +++ kexec-tools/kexec/arch/x86_64/crashdump-x86_64.c 2008-12-08 03:02:45.000000000 -0600
> @@ -47,7 +47,7 @@ static struct crash_elf_info elf_info =
> };
>
> /* Forward Declaration. */
> -static int exclude_crash_reserve_region(int *nr_ranges);
> +static int exclude_region(int *nr_ranges, uint64_t start, uint64_t end);
>
> #define KERN_VADDR_ALIGN 0x100000 /* 1MB */
>
> @@ -164,10 +164,11 @@ static struct memory_range crash_reserve
> static int get_crash_memory_ranges(struct memory_range **range, int *ranges)
> {
> const char *iomem= proc_iomem();
> - int memory_ranges = 0;
> + int memory_ranges = 0, gart = 0;
> char line[MAX_LINE];
> FILE *fp;
> unsigned long long start, end;
> + uint64_t gart_start = 0, gart_end = 0;
>
> fp = fopen(iomem, "r");
> if (!fp) {
> @@ -219,6 +220,10 @@ static int get_crash_memory_ranges(struc
> type = RANGE_ACPI;
> } else if(memcmp(str,"ACPI Non-volatile Storage\n",26) == 0 ) {
> type = RANGE_ACPI_NVS;
> + } else if (memcmp(str, "GART\n", 5) == 0) {
> + gart_start = start;
> + gart_end = end;
> + gart = 1;
> } else {
> continue;
> }
> @@ -233,8 +238,14 @@ static int get_crash_memory_ranges(struc
> memory_ranges++;
> }
> fclose(fp);
> - if (exclude_crash_reserve_region(&memory_ranges) < 0)
> + if (exclude_region(&memory_ranges, crash_reserved_mem.start,
> + crash_reserved_mem.end) < 0)
> return -1;
> + if (gart) {
> + /* exclude GART region if the system has one */
> + if (exclude_region(&memory_ranges, gart_start, gart_end) < 0)
> + return -1;
> + }
> *range = crash_memory_range;
> *ranges = memory_ranges;
> #ifdef DEBUG
> @@ -252,32 +263,27 @@ static int get_crash_memory_ranges(struc
> /* Removes crash reserve region from list of memory chunks for whom elf program
> * headers have to be created. Assuming crash reserve region to be a single
> * continuous area fully contained inside one of the memory chunks */
> -static int exclude_crash_reserve_region(int *nr_ranges)
> +static int exclude_region(int *nr_ranges, uint64_t start, uint64_t end)
> {
> int i, j, tidx = -1;
> - unsigned long long cstart, cend;
> struct memory_range temp_region;
>
> - /* Crash reserved region. */
> - cstart = crash_reserved_mem.start;
> - cend = crash_reserved_mem.end;
> -
> for (i = 0; i < (*nr_ranges); i++) {
> unsigned long long mstart, mend;
> mstart = crash_memory_range[i].start;
> mend = crash_memory_range[i].end;
> - if (cstart < mend && cend > mstart) {
> - if (cstart != mstart && cend != mend) {
> + if (start < mend && end > mstart) {
> + if (start != mstart && end != mend) {
> /* Split memory region */
> - crash_memory_range[i].end = cstart - 1;
> - temp_region.start = cend + 1;
> + crash_memory_range[i].end = start - 1;
> + temp_region.start = end + 1;
> temp_region.end = mend;
> temp_region.type = RANGE_RAM;
> tidx = i+1;
> - } else if (cstart != mstart)
> - crash_memory_range[i].end = cstart - 1;
> + } else if (start != mstart)
> + crash_memory_range[i].end = start - 1;
> else
> - crash_memory_range[i].start = cend + 1;
> + crash_memory_range[i].start = end + 1;
> }
> }
> /* Insert split memory region, if any. */
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
* Re: kdump: quad core Opteron
2008-12-08 15:56 ` Chandru
2008-12-08 21:54 ` Vivek Goyal
@ 2008-12-08 23:35 ` Bob Montgomery
2008-12-09 1:32 ` Neil Horman
2008-12-09 11:59 ` Chandru
1 sibling, 2 replies; 13+ messages in thread
From: Bob Montgomery @ 2008-12-08 23:35 UTC (permalink / raw)
To: Chandru; +Cc: kexec@lists.infradead.org, Vivek Goyal
On Mon, 2008-12-08 at 15:56 +0000, Chandru wrote:
> Hi Bob,
>
> This problem was recently reported on an LS42 blade, and the patch you provided
> resolved the issue here as well. However, I made a couple of changes to
> kexec-tools to ignore the GART memory region and not create ELF headers for it.
> This patch also seemed to work on an LS21.
>
> Thanks,
> Chandru
Hi Chandru,
I tried your patch on kexec-tools, and I'm seeing a zero-length LOAD segment
in /proc/vmcore (shown by readelf -e /proc/vmcore) right after the GART
hole:
/proc/iomem in the main kernel shows:
01000000-08ffffff : Crash kernel
20000000-23ffffff : GART
cfe4e000-cfe55fff : ACPI Tables
And readelf -e /proc/vmcore in the kdump kernel shows:
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
...
LOAD 0x000000000144d99c 0xffff810009000000 0x0000000009000000
0x0000000017000000 0x0000000017000000 RWE 0
LOAD 0x000000001844d99c 0xffff810024000000 0x0000000024000000
0x00000000abe4e000 0x00000000abe4e000 RWE 0
LOAD 0x00000000c429b99c 0xffff810024000000 0x0000000024000000
0x0000000000000000 0x0000000000000000 RWE 0
...
The first LOAD shown covers 09000000-20000000, the System RAM between
the Crash kernel and the GART.
The next LOAD covers 24000000-cfe4e000, which is the System RAM between
the GART and the ACPI Tables. So that all looks good.
Then the next LOAD is also at 24000000 with a 0 in the size fields.
I haven't had a chance to check the code yet.
Bob Montgomery
* Re: kdump: quad core Opteron
2008-12-08 23:35 ` Bob Montgomery
@ 2008-12-09 1:32 ` Neil Horman
2008-12-09 11:59 ` Chandru
1 sibling, 0 replies; 13+ messages in thread
From: Neil Horman @ 2008-12-09 1:32 UTC (permalink / raw)
To: Bob Montgomery; +Cc: Chandru, kexec@lists.infradead.org, Vivek Goyal
On Mon, Dec 08, 2008 at 04:35:16PM -0700, Bob Montgomery wrote:
> On Mon, 2008-12-08 at 15:56 +0000, Chandru wrote:
>
> > Hi Bob,
> >
> > This problem was recently reported on an LS42 blade, and the patch you
> > provided resolved the issue here as well. However, I made a couple of changes
> > to kexec-tools to ignore the GART memory region and not create ELF headers
> > for it. This patch also seemed to work on an LS21.
> >
> > Thanks,
> > Chandru
>
> Hi Chandru,
>
> I tried your patch on kexec-tools, and I'm seeing a zero-length LOAD segment
> in /proc/vmcore (shown by readelf -e /proc/vmcore) right after the GART
> hole:
>
> /proc/iomem in the main kernel shows:
> 01000000-08ffffff : Crash kernel
> 20000000-23ffffff : GART
> cfe4e000-cfe55fff : ACPI Tables
>
> And readelf -e /proc/vmcore in the kdump kernel shows:
>
> Program Headers:
> Type Offset VirtAddr PhysAddr
> FileSiz MemSiz Flags Align
> ...
> LOAD 0x000000000144d99c 0xffff810009000000 0x0000000009000000
> 0x0000000017000000 0x0000000017000000 RWE 0
> LOAD 0x000000001844d99c 0xffff810024000000 0x0000000024000000
> 0x00000000abe4e000 0x00000000abe4e000 RWE 0
> LOAD 0x00000000c429b99c 0xffff810024000000 0x0000000024000000
> 0x0000000000000000 0x0000000000000000 RWE 0
> ...
>
> The first LOAD shown covers 09000000-20000000, the System RAM between
> the Crash kernel and the GART.
> The next LOAD covers 24000000-cfe4e000, which is the System RAM between
> the GART and the ACPI Tables. So that all looks good.
>
> Then the next LOAD is also at 24000000 with a 0 in the size fields.
> I haven't had a chance to check the code yet.
>
> Bob Montgomery
>
Seems like something we can work out pretty easily. Did crash choke on this at
all?
Neil
--
/***************************************************
*Neil Horman
*Senior Software Engineer
*Red Hat, Inc.
*nhorman@redhat.com
*gpg keyid: 1024D / 0x92A74FA1
*http://pgp.mit.edu
***************************************************/
* Re: kdump: quad core Opteron
2008-12-08 23:35 ` Bob Montgomery
2008-12-09 1:32 ` Neil Horman
@ 2008-12-09 11:59 ` Chandru
1 sibling, 0 replies; 13+ messages in thread
From: Chandru @ 2008-12-09 11:59 UTC (permalink / raw)
To: bob.montgomery; +Cc: kexec@lists.infradead.org, Vivek Goyal
On Tuesday 09 December 2008 05:05:16 Bob Montgomery wrote:
> On Mon, 2008-12-08 at 15:56 +0000, Chandru wrote:
> > Hi Bob,
> >
> > This problem was recently reported on an LS42 blade, and the patch you
> > provided resolved the issue here as well. However, I made a couple of
> > changes to kexec-tools to ignore the GART memory region and not create
> > ELF headers for it. This patch also seemed to work on an LS21.
> >
> > Thanks,
> > Chandru
>
> Hi Chandru,
>
> I tried your patch on kexec-tools, and I'm seeing a zero-length LOAD segment
> in /proc/vmcore (shown by readelf -e /proc/vmcore) right after the GART
> hole:
>
> /proc/iomem in the main kernel shows:
> 01000000-08ffffff : Crash kernel
> 20000000-23ffffff : GART
> cfe4e000-cfe55fff : ACPI Tables
>
> And readelf -e /proc/vmcore in the kdump kernel shows:
>
> Program Headers:
> Type Offset VirtAddr PhysAddr
> FileSiz MemSiz Flags Align
> ...
> LOAD 0x000000000144d99c 0xffff810009000000 0x0000000009000000
> 0x0000000017000000 0x0000000017000000 RWE 0
> LOAD 0x000000001844d99c 0xffff810024000000 0x0000000024000000
> 0x00000000abe4e000 0x00000000abe4e000 RWE 0
> LOAD 0x00000000c429b99c 0xffff810024000000 0x0000000024000000
> 0x0000000000000000 0x0000000000000000 RWE 0
> ...
>
> The first LOAD shown covers 09000000-20000000, the System RAM between
> the Crash kernel and the GART.
> The next LOAD covers 24000000-cfe4e000, which is the System RAM between
> the GART and the ACPI Tables. So that all looks good.
>
> Then the next LOAD is also at 24000000 with a 0 in the size fields.
> I haven't had a chance to check the code yet.
>
> Bob Montgomery
Whoops, a 'continue' was missing in the patch. Here is the updated patch.
Exclude the GART memory region and make kexec-tools not create ELF headers for
it. The dump analysis tools do not currently appear to need a copy of the GART
memory region, hence it is ignored in kexec-tools.
Signed-off-by: Chandru S <chandru@in.ibm.com>
---
--- kexec-tools/kexec/arch/x86_64/crashdump-x86_64.c.orig	2008-12-08 01:50:41.000000000 -0600
+++ kexec-tools/kexec/arch/x86_64/crashdump-x86_64.c	2008-12-08 22:01:50.000000000 -0600
@@ -47,7 +47,7 @@ static struct crash_elf_info elf_info =
};
/* Forward Declaration. */
-static int exclude_crash_reserve_region(int *nr_ranges);
+static int exclude_region(int *nr_ranges, uint64_t start, uint64_t end);
#define KERN_VADDR_ALIGN 0x100000 /* 1MB */
@@ -164,10 +164,11 @@ static struct memory_range crash_reserve
static int get_crash_memory_ranges(struct memory_range **range, int *ranges)
{
const char *iomem= proc_iomem();
- int memory_ranges = 0;
+ int memory_ranges = 0, gart = 0;
char line[MAX_LINE];
FILE *fp;
unsigned long long start, end;
+ uint64_t gart_start = 0, gart_end = 0;
fp = fopen(iomem, "r");
if (!fp) {
@@ -219,6 +220,11 @@ static int get_crash_memory_ranges(struc
type = RANGE_ACPI;
} else if(memcmp(str,"ACPI Non-volatile Storage\n",26) == 0 ) {
type = RANGE_ACPI_NVS;
+ } else if (memcmp(str, "GART\n", 5) == 0) {
+ gart_start = start;
+ gart_end = end;
+ gart = 1;
+ continue;
} else {
continue;
}
@@ -233,8 +239,14 @@ static int get_crash_memory_ranges(struc
memory_ranges++;
}
fclose(fp);
- if (exclude_crash_reserve_region(&memory_ranges) < 0)
+ if (exclude_region(&memory_ranges, crash_reserved_mem.start,
+ crash_reserved_mem.end) < 0)
return -1;
+ if (gart) {
+ /* exclude GART region if the system has one */
+ if (exclude_region(&memory_ranges, gart_start, gart_end) < 0)
+ return -1;
+ }
*range = crash_memory_range;
*ranges = memory_ranges;
#ifdef DEBUG
@@ -252,32 +264,27 @@ static int get_crash_memory_ranges(struc
/* Removes crash reserve region from list of memory chunks for whom elf program
* headers have to be created. Assuming crash reserve region to be a single
* continuous area fully contained inside one of the memory chunks */
-static int exclude_crash_reserve_region(int *nr_ranges)
+static int exclude_region(int *nr_ranges, uint64_t start, uint64_t end)
{
int i, j, tidx = -1;
- unsigned long long cstart, cend;
struct memory_range temp_region;
- /* Crash reserved region. */
- cstart = crash_reserved_mem.start;
- cend = crash_reserved_mem.end;
-
for (i = 0; i < (*nr_ranges); i++) {
unsigned long long mstart, mend;
mstart = crash_memory_range[i].start;
mend = crash_memory_range[i].end;
- if (cstart < mend && cend > mstart) {
- if (cstart != mstart && cend != mend) {
+ if (start < mend && end > mstart) {
+ if (start != mstart && end != mend) {
/* Split memory region */
- crash_memory_range[i].end = cstart - 1;
- temp_region.start = cend + 1;
+ crash_memory_range[i].end = start - 1;
+ temp_region.start = end + 1;
temp_region.end = mend;
temp_region.type = RANGE_RAM;
tidx = i+1;
- } else if (cstart != mstart)
- crash_memory_range[i].end = cstart - 1;
+ } else if (start != mstart)
+ crash_memory_range[i].end = start - 1;
else
- crash_memory_range[i].start = cend + 1;
+ crash_memory_range[i].start = end + 1;
}
}
/* Insert split memory region, if any. */
* Re: kdump: quad core Opteron
2008-12-08 21:54 ` Vivek Goyal
@ 2008-12-09 12:12 ` Chandru
2008-12-09 18:55 ` Bob Montgomery
0 siblings, 1 reply; 13+ messages in thread
From: Chandru @ 2008-12-09 12:12 UTC (permalink / raw)
To: Vivek Goyal; +Cc: bob.montgomery, kexec
On Tuesday 09 December 2008 03:24:32 Vivek Goyal wrote:
> Hi Chandru,
>
> So this patch will solve the issue (at least for /proc/vmcore) even if
> we don't make any changes on kernel side?
>
> Thanks
> Vivek
Hi Vivek,
Yes, with this patch and without any changes to the kernel, we can boot into
the kdump kernel on these quad-core Opteron machines and collect a dump.
thanks,
Chandru
* Re: kdump: quad core Opteron
2008-12-09 12:12 ` Chandru
@ 2008-12-09 18:55 ` Bob Montgomery
0 siblings, 0 replies; 13+ messages in thread
From: Bob Montgomery @ 2008-12-09 18:55 UTC (permalink / raw)
To: Chandru; +Cc: kexec@lists.infradead.org, Vivek Goyal
On Tue, 2008-12-09 at 12:12 +0000, Chandru wrote:
> On Tuesday 09 December 2008 03:24:32 Vivek Goyal wrote:
>
> > Hi Chandru,
> >
> > So this patch will solve the issue (at least for /proc/vmcore) even if
> > we don't make any changes on kernel side?
> >
> > Thanks
> > Vivek
>
> Hi Vivek,
>
> Yes, with this patch and without any changes to the kernel, we can boot into
> the kdump kernel on these quad-core Opteron machines and collect a dump.
Provided that the kernel is new enough to already include the "Insert GART
region into resource map" change. Otherwise, you will need to change the
kernel for this to work.
Chandru, thanks for coding this up.
Bob Montgomery
Thread overview: 13+ messages
2008-10-07 12:51 kdump: quad core Opteron Chandru
2008-10-07 13:06 ` Bernhard Walle
2008-10-07 13:24 ` Vivek Goyal
2008-10-07 15:59 ` Bob Montgomery
2008-10-08 13:51 ` Chandru
2008-12-08 15:56 ` Chandru
2008-12-08 21:54 ` Vivek Goyal
2008-12-09 12:12 ` Chandru
2008-12-09 18:55 ` Bob Montgomery
2008-12-08 23:35 ` Bob Montgomery
2008-12-09 1:32 ` Neil Horman
2008-12-09 11:59 ` Chandru
2008-10-08 13:40 ` Chandru