Re: HPET regression in 2.6.26 versus 2.6.25 -- connection between HPET and lockups found

* Re: HPET regression in 2.6.26 versus 2.6.25 -- connection between HPET and lockups found
@ 2008-08-19 12:49 David Witbrodt
  2008-08-19 13:08 ` Ingo Molnar
  0 siblings, 1 reply; 6+ messages in thread
From: David Witbrodt @ 2008-08-19 12:49 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Yinghai Lu, linux-kernel, Paul E. McKenney, Peter Zijlstra,
	Thomas Gleixner, H. Peter Anvin, netdev

> Just to make sure: on a working kernel, do you get the HPET messages? 
> I.e. does the hpet truly work in that case?

On the "fileserver", where 2.6.25 works but 2.6.26 locks up, the HPET
_does_ work on a working kernel:

$ uname -r
2.6.26.revert1

$ dmesg | grep -i hpet
ACPI: HPET 77FE80C0, 0038 (r1 RS690  AWRDACPI 42302E31 AWRD       98)
ACPI: HPET id: 0x10b9a201 base: 0xfed00000
hpet clockevent registered
hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0, 0
hpet0: 4 32-bit timers, 14318180 Hz
hpet_resources: 0xfed00000 is busy

What I didn't realize is that the "desktop" machine, where 2.6.26 has
always "worked", does NOT have a working HPET after all, even though 
I have enabled all HPET options in .config:

$ uname -r
2.6.26.080801.desktop.uvesafb

$ dmesg | grep -i hpet
$ 

This means I misunderstood my situation on "desktop".  I believed HPET
was working on all of my machines, but now I am not certain that it 
ever worked on "desktop" since I built it (May 2007).  The question 
never arose before, and because I enabled the HPET option in .config,
I just assumed that HPET was working.  (Duh...)  I failed to look into
this until now.

At any rate, my subject line is still accurate:  there _is_ an HPET
regression on "fileserver" (and "webserver"), since it worked on
2.6.25 kernels but causes lockups on 2.6.2[67] kernels.

(I don't know what is going on with "desktop":  does the motherboard
lack HPET, or does the Linux kernel not support the HPET hardware on 
the motherboard?)

BTW:  the 'dmesg' output above is the same on "desktop" with 2.6.25
and 2.6.26 -- I just checked to be sure.  For "fileserver", I checked
an old 2.6.25 kernel just now, and the output is identical.

Another experiment:  I just tried this...

static __init int hpet_insert_resource(void)
{
-     if (!hpet_res)
+     /* if (!hpet_res) */
         return 1;

-    return insert_resource(&iomem_resource, hpet_res);
+    /* return insert_resource(&iomem_resource, hpet_res); */
}

... and the lock up still occurs.  So, the memory is allocated but
the resource info is not inserted into the tree.  Whether the
dynamic memory for hpet_res is being damaged or not has no bearing
on the lockups, it would seem.  Looks like I was barking up the
wrong tree....

Dave W.


* Re: HPET regression in 2.6.26 versus 2.6.25 -- connection between HPET and lockups found
  2008-08-19 12:49 HPET regression in 2.6.26 versus 2.6.25 -- connection between HPET and lockups found David Witbrodt
@ 2008-08-19 13:08 ` Ingo Molnar
  0 siblings, 0 replies; 6+ messages in thread
From: Ingo Molnar @ 2008-08-19 13:08 UTC (permalink / raw)
  To: David Witbrodt
  Cc: Yinghai Lu, linux-kernel, Paul E. McKenney, Peter Zijlstra,
	Thomas Gleixner, H. Peter Anvin, netdev


* David Witbrodt <dawitbro@sbcglobal.net> wrote:

> > Just to make sure: on a working kernel, do you get the HPET 
> > messages? I.e. does the hpet truly work in that case?
> 
> On the "fileserver", where 2.6.25 works but 2.6.26 locks up, the HPET 
> _does_ work on a working kernel:
> 
> $ uname -r
> 2.6.26.revert1
> 
> $ dmesg | grep -i hpet
> ACPI: HPET 77FE80C0, 0038 (r1 RS690  AWRDACPI 42302E31 AWRD       98)
> ACPI: HPET id: 0x10b9a201 base: 0xfed00000
> hpet clockevent registered
> hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0, 0
> hpet0: 4 32-bit timers, 14318180 Hz
> hpet_resources: 0xfed00000 is busy

btw., you might also want to look into drivers/char/hpet.c and 
instrument that a bit. In particular the ioremap()s done there will show 
exactly how the hpet is mapped.

In particular this bit:

                if (hpet_is_known(hdp)) {
                        printk(KERN_DEBUG "%s: 0x%lx is busy\n",
                                __func__, hdp->hd_phys_address);
                        iounmap(hdp->hd_address);
                        return AE_ALREADY_EXISTS;
                }


suggests that you've got multiple hpets listed by the BIOS?

> What I didn't realize is that the "desktop" machine, where 2.6.26 has 
> always "worked", does NOT have a working HPET after all, even though I 
> have enabled all HPET options in .config:

that's OK - you've still got a regression.

> (I don't know what is going on with "desktop": does the motherboard 
> lack HPET, or does the Linux kernel not support the HPET hardware on 
> the motherboard?)

Whether the system has a hpet listed in the BIOS data structures can be 
seen in acpidump [in the pmtools package].

Even if the BIOS does not list it, the system might have hpet in the 
chipset - hpet=force can be tried to force-enable it.

	Ingo


* Re: HPET regression in 2.6.26 versus 2.6.25 -- connection between HPET and lockups found
@ 2008-08-19  3:51 David Witbrodt
  2008-08-19  9:23 ` Ingo Molnar
  0 siblings, 1 reply; 6+ messages in thread
From: David Witbrodt @ 2008-08-19  3:51 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Yinghai Lu, linux-kernel, Paul E. McKenney, Peter Zijlstra,
	Thomas Gleixner, H. Peter Anvin, netdev

> > Does this connection between HPET and insert_resource() look 
> > meaningful, or is this a coincidence?
> 
> it is definitely the angle i'd suspect the most.
> 
> perhaps we stomp over some piece of memory that is "available RAM" 
> according to your BIOS, but in reality is used by something. With 
> previous kernels we got lucky and have put a data structure there which 
> kept your hpet still working. (a bit far-fetched i think, but the best 
> theory i could come up with)

Working... or NOT working.  Tonight I noticed something strange about 
my desktop machine, which _works_ with 2.6.2[67]:  even though it shares
the same HPET .config settings as the 2 problem machines,

CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
CONFIG_HPET=y
CONFIG_HPET_RTC_IRQ=y
CONFIG_HPET_MMAP=y

apparently no HPET device gets configured by the kernel:

$ dmesg | grep -i hpet
$

In contrast, I get this on the 2 "bad" machines if using the 2.6.26
kernel with the 2 problem commits reverted:

$ dmesg | grep -i hpet
ACPI: HPET 77FE80C0, 0038 (r1 RS690  AWRDACPI 42302E31 AWRD       98)
ACPI: HPET id: 0x10b9a201 base: 0xfed00000
hpet clockevent registered
hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0, 0
hpet0: 4 32-bit timers, 14318180 Hz
hpet_resources: 0xfed00000 is busy

That makes it look like my third machine might have locked up with 
2.6.2[67] as well, but some problem configuring HPET actually prevents
it from locking up.  I wonder how widespread this badness really is 
after all?!  Are we not seeing more reports of lockups simply because 
people are getting lucky on AMD dual core machines, and having their
HPET _fail_ instead of their kernel locking up?

> the address you printed out (0xffff88000100f000), does look _somewhat_ 
> suspicious. It corresponds to the physical address of 0x100f000. That is 
> _just_ above the 16MB boundary. It should not be relevant normally - but 
> it's still somewhat suspicious.

I guess I was hinting around about the upper 32 bits -- I take it that
these pointers are virtualized, and the upper half is some sort of
descriptor?  If that pointer were in a flat memory model, then it would be
pointing _way_ past the end of my 2 GB of RAM, which would end around
0x0000000080000000.

I am not used to looking at raw pointer addresses, just pointer variable 
names.  I think I was recalling the /proc/iomem data that Yinghai asked 
for, but this stuff is just offsets stripped of descriptors, huh?:

$ cat /proc/iomem
00000000-0009f3ff : System RAM
0009f400-0009ffff : reserved
000f0000-000fffff : reserved
00100000-77fdffff : System RAM
  00200000-0056ca21 : Kernel code
  0056ca22-006ce3d7 : Kernel data
  00753000-0079a3c7 : Kernel bss
77fe0000-77fe2fff : ACPI Non-volatile Storage
77fe3000-77feffff : ACPI Tables
77ff0000-77ffffff : reserved
78000000-7fffffff : pnp 00:0d
d8000000-dfffffff : PCI Bus #01
  d8000000-dfffffff : 0000:01:05.0
    d8000000-d8ffffff : uvesafb
e0000000-efffffff : PCI MMCONFIG 0
  e0000000-efffffff : reserved
fdc00000-fdcfffff : PCI Bus #02
  fdcff000-fdcff0ff : 0000:02:05.0
    fdcff000-fdcff0ff : r8169
fdd00000-fdefffff : PCI Bus #01
  fdd00000-fddfffff : 0000:01:05.0
  fdee0000-fdeeffff : 0000:01:05.0
  fdefc000-fdefffff : 0000:01:05.2
    fdefc000-fdefffff : ICH HD audio
fdf00000-fdffffff : PCI Bus #02
fe020000-fe023fff : 0000:00:14.2
  fe020000-fe023fff : ICH HD audio
fe029000-fe0290ff : 0000:00:13.5
  fe029000-fe0290ff : ehci_hcd
fe02a000-fe02afff : 0000:00:13.4
  fe02a000-fe02afff : ohci_hcd
fe02b000-fe02bfff : 0000:00:13.3
  fe02b000-fe02bfff : ohci_hcd
fe02c000-fe02cfff : 0000:00:13.2
  fe02c000-fe02cfff : ohci_hcd
fe02d000-fe02dfff : 0000:00:13.1
  fe02d000-fe02dfff : ohci_hcd
fe02e000-fe02efff : 0000:00:13.0
  fe02e000-fe02efff : ohci_hcd
fe02f000-fe02f3ff : 0000:00:12.0
  fe02f000-fe02f3ff : ahci
fec00000-fec00fff : IOAPIC 0
  fec00000-fec00fff : pnp 00:0d
fed00000-fed003ff : HPET 0
  fed00000-fed003ff : 0000:00:14.0
fee00000-fee00fff : Local APIC
fff80000-fffeffff : pnp 00:0d
ffff0000-ffffffff : pnp 00:0d

> To test this theory, could you tweak this:
> 
>   alloc_bootmem(sizeof(*hpet_res) + HPET_RESOURCE_NAME_SIZE);
> 
> to be:
> 
>   alloc_bootmem_low(sizeof(*hpet_res) + HPET_RESOURCE_NAME_SIZE);
> 
> this will allocate the hpet resource descriptor in lower RAM.

Results:  strange... still locked up, and more or less the same output,
especially the same address!:

Data from arch/x86/kernel/acpi/boot.c:
  hpet_res = ffff88000100f000    requested size: 65
  sequence = 0    insert_resource() returned:  0
  broken_bios: 0

Here is a section of 'git diff arch/x86/kernel/acpi/boot.c' to
verify that I _did_ make the change:

===== BEGIN DIFF =============
@@ -701,13 +711,16 @@ static int __init acpi_parse_hpet(struct acpi_table_header *table)
      * the resource tree during the lateinit timeframe.
      */
 #define HPET_RESOURCE_NAME_SIZE 9
-    hpet_res = alloc_bootmem(sizeof(*hpet_res) + HPET_RESOURCE_NAME_SIZE);
+    hpet_res = alloc_bootmem_low (sizeof(*hpet_res) + HPET_RESOURCE_NAME_SIZE);
+    dw_hpet_res = hpet_res;
+    dw_req_size = sizeof (*hpet_res) + HPET_RESOURCE_NAME_SIZE;

     hpet_res->name = (void *)&hpet_res[1];
     hpet_res->flags = IORESOURCE_MEM;
     snprintf((char *)hpet_res->name, HPET_RESOURCE_NAME_SIZE, "HPET %u",
          hpet_tbl->sequence);
===== END DIFF =============

It's like the change to alloc_bootmem_low made no difference at all!

The Aug. 12 messages I saw about alloc_bootmem() had to do with alignment
issues on 1 GB boundaries on x86_64 NUMA machines.  I certainly do have
x86_64 NUMA machines, but the behavior above seems to have nothing to do
with alignment issues.

> Another idea: could you increase HPET_RESOURCE_NAME_SIZE from 9 to 
> something larger (via the patch below)? Maybe the bug is that this 
> overflows:
> 
>         snprintf((char *)hpet_res->name, HPET_RESOURCE_NAME_SIZE, "HPET %u",
>                  hpet_tbl->sequence);
> 
> and corrupts the memory next to the hpet resource descriptor.

I noticed the potential for sequence to overflow the 9 byte buffer size
right away.  I got my hopes up... until I looked in include/acpi/actbl1.h:

struct acpi_table_hpet {
        struct acpi_table_header  header;
        u32  id;
        struct acpi_generic_address  address;
        u8  sequence;
        u16  minimum_tick;
        u8  flags;
};

The original programmer set HPET_RESOURCE_NAME_SIZE to 9 because the
combined length of "HPET " and a u8 is guaranteed to be <= 8.  I have
applied the change, nevertheless:

> @@ -700,7 +700,7 @@ static int __init acpi_parse_hpet(struct acpi_table_header *table)
>      * Allocate and initialize the HPET firmware resource for adding into
>      * the resource tree during the lateinit timeframe.
>      */
> -#define HPET_RESOURCE_NAME_SIZE 9
> +#define HPET_RESOURCE_NAME_SIZE 14
>     hpet_res = alloc_bootmem(sizeof(*hpet_res) + HPET_RESOURCE_NAME_SIZE);

Results:  locked up

Data from arch/x86/kernel/acpi/boot.c:
  hpet_res = ffff88000100f000    requested size: 70
  sequence = 0    insert_resource() returned:  0
  broken_bios: 0

> Also, you could try to increase the bootmem allocation drastically, by 
> say 16*1024 bytes, via:
> 
>     hpet_res = alloc_bootmem(sizeof(*hpet_res) + HPET_RESOURCE_NAME_SIZE + 16*1024);
>     hpet_res = (void *)hpet_res + 15*1024;
> 
> this will pad the memory at ~16MB and not use it for any resource. 
> Arguably a really weird hack, but i'm running out of ideas ...

I tried this:

-    hpet_res = alloc_bootmem(sizeof(*hpet_res) + HPET_RESOURCE_NAME_SIZE);
+    hpet_res = alloc_bootmem(sizeof(*hpet_res) + HPET_RESOURCE_NAME_SIZE + 16*1024);
+    hpet_res = (void*) hpet_res + 1024;

Results:  locked up

Data from arch/x86/kernel/acpi/boot.c:
  hpet_res = ffff88000100f400    requested size: 70
  sequence = 0    insert_resource() returned:  0
  broken_bios: 0

It looks like this resource does not get mangled, but maybe others are.

In a weekend experiment (for which I didn't post results), I recursed the
iomem_resource tree -- struggling to get all of the output to fit on one
80x25 screen.  Everything there seemed to be intact, with the addresses
matching the output of 'cat /proc/iomem' on a working kernel... except
(naturally) for some missing resources because the kernel locks before
getting to them.

But what does any of this have to do with the fact that the lockup occurs
in synchronize_rcu()?????  Madness... MADNESS!!!!!

[Old issue]  No one responded when I asked for some help with 'git' to
move my reverts up from "v2.6.26" to the HEAD of origin/master (or
tip/master).  Did you see that question, and do you have any advice?

Thanks Ingo,
Dave W.


* Re: HPET regression in 2.6.26 versus 2.6.25 -- connection between HPET and lockups found
  2008-08-19  3:51 David Witbrodt
@ 2008-08-19  9:23 ` Ingo Molnar
  0 siblings, 0 replies; 6+ messages in thread
From: Ingo Molnar @ 2008-08-19  9:23 UTC (permalink / raw)
  To: David Witbrodt
  Cc: Yinghai Lu, linux-kernel, Paul E. McKenney, Peter Zijlstra,
	Thomas Gleixner, H. Peter Anvin, netdev

* David Witbrodt <dawitbro@sbcglobal.net> wrote:

> > the address you printed out (0xffff88000100f000), does look 
> > _somewhat_ suspicious. It corresponds to the physical address of 
> > 0x100f000. That is _just_ above the 16MB boundary. It should not be 
> > relevant normally - but it's still somewhat suspicious.
> 
> I guess I was hinting around about the upper 32 bits -- I take it that 
> these pointers are virtualized, and the upper half is some sort of 
> descriptor?  If that pointer were in a flat memory model, then it would 
> be pointing _way_ past the end of my 2 GB of RAM, which would end 
> around 0x0000000080000000.

correct, the 64-bit "flat" physical addresses are mapped with a shift: 
they are shifted down into negative addresses, starting at:

earth4:~/tip> grep PAGE_OFFSET include/asm-x86/page_64.h
#define __PAGE_OFFSET           _AC(0xffff880000000000, UL)

i.e. physical address zero is mapped to "minus 120 terabytes". [we do 
this on the 64-bit kernel to get out of the way of the application 
address space, which goes from the usual zero.]

All in all, 0xffff88000100f000 is a regular kernel address that 
corresponds to the physical address of 0x100f000 - i.e. 16 MB plus 
15*4KB.

> I am not used to looking at raw pointer addresses, just pointer variable 
> names.  I think I was recalling the /proc/iomem data that Yinghai asked 
> for, but this stuff is just offsets stripped of descriptors, huh?:
> 
> $ cat /proc/iomem
> fed00000-fed003ff : HPET 0
>   fed00000-fed003ff : 0000:00:14.0

correct - these resource descriptors are in the "physical address" space 
(system RAM, chipset decoded addresses, device decoded addresses, etc.).

fed00000-fed003ff means that your HPET hardware sits at physical address 
4275044352, or just below 4GB. That is the usual place for such non-RAM 
device memory - it does not get in the way of normal RAM.

> It's like the change to alloc_bootmem_low made no difference at all!
> 
> The Aug. 12 messages I saw about alloc_bootmem() had to do with 
> alignment issues on 1 GB boundaries on x86_64 NUMA machines.  I 
> certainly do have x86_64 NUMA machines, but the behavior above seems 
> to have nothing to do with alignment issues.

the resource descriptor is really a kernel internal abstraction - it's 
just a memory buffer we put the hpet address into. It's in essence used 
for /proc/iomem output, not much else. So it _should_ not have any 
effects.

the real difference is likely that the hpet hardware is activated on 
your box - and apparently is causing problems.

> Results: locked up

:-/

Just to make sure: on a working kernel, do you get the HPET messages? 
I.e. does the hpet truly work in that case?

	Ingo


* Re: HPET regression in 2.6.26 versus 2.6.25 -- connection between HPET and lockups found
@ 2008-08-19  0:34 David Witbrodt
  2008-08-19  1:14 ` Ingo Molnar
  0 siblings, 1 reply; 6+ messages in thread
From: David Witbrodt @ 2008-08-19  0:34 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: linux-kernel, Ingo Molnar, Paul E. McKenney, Peter Zijlstra,
	Thomas Gleixner, H. Peter Anvin, netdev


As part of my experiments to determine the root cause of my lockups,
I was searching through the kernel sources trying to discover any
connection between the changes in the commits introducing the lockups
(3def3d6d... and 1e934dda...) and the fact that "hpet=disable" 
alleviates the lockups.

I finally discovered something that looks promising!


Both of those commits introduce changes involving insert_resource(),
and I found the function hpet_insert_resource() in
arch/x86/kernel/acpi/boot.c that also uses insert_resource():

static __init int hpet_insert_resource(void)
{
        if (!hpet_res)
                return 1;

        return insert_resource(&iomem_resource, hpet_res);
}


The effect of "hpet=disable" is to prevent the hpet_res pointer,

    static struct __initdata resource *hpet_res;

from being attached to memory, keeping it NULL and causing the
return value to indicate that the HPET resource was not assigned.

When not using "hpet=disable", the memory location of hpet_res
is added to the iomem_resource tree.  The code that obtains the
memory for hpet_res is in the same file, in the lines immediately
preceding:

static int __init acpi_parse_hpet(struct acpi_table_header *table)
{
        struct acpi_table_hpet *hpet_tbl;

...
#define HPET_RESOURCE_NAME_SIZE 9
        hpet_res = alloc_bootmem(sizeof(*hpet_res) + HPET_RESOURCE_NAME_SIZE);

...
        return 0;
}


Trying to discover if something was going haywire in this part of the code,
I tried to capture some data which I could save until just before the kernel
locks so that I could printk() it and still see it without having it scroll
off the top:

===== BEGIN DIFF ==========
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 9d3528c..c4670a6 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -644,6 +644,11 @@ static int __init acpi_parse_sbf(struct acpi_table_header *table)
 
 static struct __initdata resource *hpet_res;
 
+extern void *dw_hpet_res;
+extern int dw_broken_bios;
+extern unsigned dw_seq;
+extern unsigned dw_req_size;
+
 static int __init acpi_parse_hpet(struct acpi_table_header *table)
 {
     struct acpi_table_hpet *hpet_tbl;
@@ -672,6 +677,9 @@ static int __init acpi_parse_hpet(struct acpi_table_header *table)
                hpet_tbl->id, hpet_address);
         return 0;
     }
+
+    dw_broken_bios = 0;
+
 #ifdef CONFIG_X86_64
     /*
      * Some even more broken BIOSes advertise HPET at
@@ -679,6 +687,8 @@ static int __init acpi_parse_hpet(struct acpi_table_header *table)
      * some noise:
      */
     if (hpet_address == 0xfed0000000000000UL) {
+            dw_broken_bios = 1;
+
         if (!hpet_force_user) {
             printk(KERN_WARNING PREFIX "HPET id: %#x "
                    "base: 0xfed0000000000000 is bogus\n "
@@ -702,12 +712,15 @@ static int __init acpi_parse_hpet(struct acpi_table_header *table)
      */
 #define HPET_RESOURCE_NAME_SIZE 9
     hpet_res = alloc_bootmem(sizeof(*hpet_res) + HPET_RESOURCE_NAME_SIZE);
+    dw_hpet_res = hpet_res;
+    dw_req_size = sizeof (*hpet_res) + HPET_RESOURCE_NAME_SIZE;
 
     hpet_res->name = (void *)&hpet_res[1];
     hpet_res->flags = IORESOURCE_MEM;
     snprintf((char *)hpet_res->name, HPET_RESOURCE_NAME_SIZE, "HPET %u",
          hpet_tbl->sequence);
 
+    dw_seq = hpet_tbl->sequence;
     hpet_res->start = hpet_address;
     hpet_res->end = hpet_address + (1 * 1024) - 1;
 
@@ -718,12 +731,19 @@ static int __init acpi_parse_hpet(struct acpi_table_header *table)
  * hpet_insert_resource inserts the HPET resources used into the resource
  * tree.
  */
+extern int dw_ir_retval;
+
 static __init int hpet_insert_resource(void)
 {
+        int retval;
+
     if (!hpet_res)
         return 1;
 
-    return insert_resource(&iomem_resource, hpet_res);
+    retval = insert_resource(&iomem_resource, hpet_res);
+    dw_ir_retval = retval;
+
+    return retval;
 }
 
 late_initcall(hpet_insert_resource);
diff --git a/net/core/dev.c b/net/core/dev.c
index 600bb23..fe27b94 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4304,10 +4304,21 @@ void free_netdev(struct net_device *dev)
     put_device(&dev->dev);
 }
 
+void *dw_hpet_res;
+int dw_broken_bios;
+unsigned dw_seq;
+int dw_ir_retval;
+unsigned dw_req_size;
+
 /* Synchronize with packet receive processing. */
 void synchronize_net(void)
 {
     might_sleep();
+
+    printk ("Data from arch/x86/kernel/acpi/boot.c:\n");
+    printk ("  hpet_res = %p    requested size: %u\n", dw_hpet_res, dw_req_size);
+    printk ("  sequence = %u    insert_resource() returned:  %d\n", dw_seq, dw_ir_retval);
+        printk ("  broken_bios: %d\n", dw_broken_bios);
     synchronize_rcu();
 }
===== END DIFF ==========


The output I get when the kernel locks up looks perfectly OK, except
maybe for the address of hpet_res (which I am not knowledgeable enough
to judge):

Data from arch/x86/kernel/acpi/boot.c:
  hpet_res = ffff88000100f000    broken_bios: 0
  sequence = 0    insert_resource() returned: 0


I see some recent (Aug. 2008) discussion of alloc_bootmem() being 
broken, so maybe that is related to my problem.

Does this connection between HPET and insert_resource() look meaningful,
or is this a coincidence?


Thanks,
Dave W.


* Re: HPET regression in 2.6.26 versus 2.6.25 -- connection between HPET and lockups found
  2008-08-19  0:34 David Witbrodt
@ 2008-08-19  1:14 ` Ingo Molnar
  0 siblings, 0 replies; 6+ messages in thread
From: Ingo Molnar @ 2008-08-19  1:14 UTC (permalink / raw)
  To: David Witbrodt
  Cc: Yinghai Lu, linux-kernel, Paul E. McKenney, Peter Zijlstra,
	Thomas Gleixner, H. Peter Anvin, netdev

* David Witbrodt <dawitbro@sbcglobal.net> wrote:

> The output I get when the kernel locks up looks perfectly OK, except 
> maybe for the address of hpet_res (which I am not knowledgeable enough 
> to judge):
> 
> Data from arch/x86/kernel/acpi/boot.c:
>   hpet_res = ffff88000100f000    broken_bios: 0
>   sequence = 0    insert_resource() returned: 0
> 
> 
> I see some recent (Aug. 2008) discussion of alloc_bootmem() being 
> broken, so maybe that is related to my problem.
> 
> Does this connection between HPET and insert_resource() look 
> meaningful, or is this a coincidence?

it is definitely the angle i'd suspect the most.

perhaps we stomp over some piece of memory that is "available RAM" 
according to your BIOS, but in reality is used by something. With 
previous kernels we got lucky and have put a data structure there which 
kept your hpet still working. (a bit far-fetched i think, but the best 
theory i could come up with)

the address you printed out (0xffff88000100f000), does look _somewhat_ 
suspicious. It corresponds to the physical address of 0x100f000. That is 
_just_ above the 16MB boundary. It should not be relevant normally - but 
it's still somewhat suspicious.

To test this theory, could you tweak this:

  alloc_bootmem(sizeof(*hpet_res) + HPET_RESOURCE_NAME_SIZE);

to be:

  alloc_bootmem_low(sizeof(*hpet_res) + HPET_RESOURCE_NAME_SIZE);

this will allocate the hpet resource descriptor in lower RAM.

Another idea: could you increase HPET_RESOURCE_NAME_SIZE from 9 to 
something larger (via the patch below)? Maybe the bug is that this 
overflows:

        snprintf((char *)hpet_res->name, HPET_RESOURCE_NAME_SIZE, "HPET %u",
                 hpet_tbl->sequence);

and corrupts the memory next to the hpet resource descriptor. Depending 
on random details of the kernel, this might or might not turn into some 
real problem. The way of allocating the resource and its name string 
together in a bootmem allocation is a bit quirky - but should be Ok 
otherwise.

Hm, i see you have printed out hpet_tbl->sequence, and that gives 0, 
which should be borderline OK in terms of overflow. Cannot hurt to add 
this patch to your queue of test-patches :-/

Also, you could try to increase the bootmem allocation drastically, by 
say 16*1024 bytes, via:

 	hpet_res = alloc_bootmem(sizeof(*hpet_res) + HPET_RESOURCE_NAME_SIZE + 16*1024);
        hpet_res = (void *)hpet_res + 15*1024;

this will pad the memory at ~16MB and not use it for any resource. 
Arguably a really weird hack, but i'm running out of ideas ...

	Ingo

------------------>
From 6319ee82bc363e2fd356782dacc9e01e5b33694e Mon Sep 17 00:00:00 2001
From: Ingo Molnar <mingo@elte.hu>
Date: Tue, 19 Aug 2008 03:10:51 +0200
Subject: [PATCH] hpet: increase HPET_RESOURCE_NAME_SIZE

only had enough space for a 4 digit sprintf. If the index is wider
for any reason, we'll corrupt memory ...

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 arch/x86/kernel/acpi/boot.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 9d3528c..f6350aa 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -700,7 +700,7 @@ static int __init acpi_parse_hpet(struct acpi_table_header *table)
 	 * Allocate and initialize the HPET firmware resource for adding into
 	 * the resource tree during the lateinit timeframe.
 	 */
-#define HPET_RESOURCE_NAME_SIZE 9
+#define HPET_RESOURCE_NAME_SIZE 14
 	hpet_res = alloc_bootmem(sizeof(*hpet_res) + HPET_RESOURCE_NAME_SIZE);

 	hpet_res->name = (void *)&hpet_res[1];


