From: David Witbrodt <dawitbro@sbcglobal.net>
To: Ingo Molnar <mingo@elte.hu>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>,
linux-kernel@vger.kernel.org,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Peter Zijlstra <peterz@infradead.org>,
Thomas Gleixner <tglx@linutronix.de>,
"H. Peter Anvin" <hpa@zytor.com>, netdev <netdev@vger.kernel.org>
Subject: Re: HPET regression in 2.6.26 versus 2.6.25 -- found another user with the same regression
Date: Tue, 19 Aug 2008 21:51:43 -0700 (PDT) [thread overview]
Message-ID: <109483.89278.qm@web82104.mail.mud.yahoo.com> (raw)
> > $ dmesg | grep -i hpet
> > ACPI: HPET 77FE80C0, 0038 (r1 RS690 AWRDACPI 42302E31 AWRD 98)
> > ACPI: HPET id: 0x10b9a201 base: 0xfed00000
> > hpet clockevent registered
> > hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0, 0
> > hpet0: 4 32-bit timers, 14318180 Hz
> > hpet_resources: 0xfed00000 is busy
>
> btw., you might also want to look into drivers/char/hpet.c and
> instrument that a bit. In particular the ioremap()s done there will show
> exactly how the hpet is mapped.
Well, I exhausted about 80% of the list of experiments I put together
on Saturday, but I still can't make a 2.6.2[67] kernel boot without
"hpet=disable" or reverting commits 3def3d6d... and 1e934dda...
I spent so many hours on this today... My head is spinning, and I'm
afraid springs and smoke will start emanating from my hard drive soon
from all the recompiling and rebooting!
I need to warn you all: I discovered today, for the first time, that
I am not the first user to report this bug. This guy got bit by it
back in May, at version 2.6.26-rc2:
[blog] http://ciaranm.wordpress.com/tag/f-i90hd/
[LKML] http://www.uwsg.indiana.edu/hypermail/linux/kernel/0805.2/0746.html
He has different hardware from mine, so when 2.6.26 starts hitting the
distros you may see a flood of complaints -- and I came to LKML partly
with the purpose of providing a bug fix patch (or, less preferably, a
reversion patch) for Debian, my distro of choice.
I am not giving up. I _did_ look at drivers/char/hpet.c as requested,
but since this code did not change before and after 3def3d6d..., I
was not sure what to look for that would be harmful. The same turned out
to be true about the "connection" I found between HPET and the calls
of insert_resource(), though this could actually be affected if my latest
hypothesis pans out. (All of my ideas have failed so far, though, so it
will not surprise me if my new idea fails as well.)
I found another item in arch/x86/kernel/acpi/boot.c -- which I suspect
_is_ a bug -- but which had no effect on my lockup issue when I "fixed"
it. I will let a guru decide if I have found anything important:
===== BEGIN DIFF ==========================
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 2cdc9de..d5a9d9d 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -669,7 +669,7 @@ static int __init acpi_parse_hpet(struct acpi_table_header *table)
memset(hpet_res, 0, sizeof(*hpet_res));
hpet_res->name = (void *)&hpet_res[1];
- hpet_res->flags = IORESOURCE_MEM;
+ hpet_res->flags = IORESOURCE_MEM | IORESOURCE_BUSY;
snprintf((char *)hpet_res->name, HPET_RESOURCE_NAME_SIZE, "HPET %u",
hpet_tbl->sequence);
===== END DIFF ==========================
The dynamically-allocated structure that hpet_res points to eventually gets
added to the resource tree by
static __init int hpet_insert_resource(void);
which calls insert_resource(). My thought is that we are supposed to be
marking the memory region as unavailable, so that nothing else will touch
it later, right?
Anyway, what happened in 3def3d6d to cause the regression? I have a new
hypothesis to test, but I'm too tired to continue right now -- so I'll hit
it again tomorrow before I go to work:
Up until now, I have focused on the fact that request_resource() was
replaced by insert_resource(). I did not pay attention to another aspect
of that commit -- the large change in the order of execution of where the
kernel memory regions are added to the resource list.
The original (2.6.25) approach, which works on my machine, is to identify
RAM resources as they are added to the resource list, then tack on the
kernel memory regions to the (proper) resource as it is added to the tree:
for (...) {
[...]
if (e820.map[i].type == E820_RAM) {
request_resource(res, code_resource);
request_resource(res, data_resource);
request_resource(res, bss_resource);
[...]
}
insert_resource(&iomem_resource, res);
}
The problem commit moves those 3 request_resource() calls out of
e820_reserve_resources() [arch/x86/kernel/e820{,_64}.c] and into
setup_arch() [arch/x86/kernel/setup{,_64}.c]. The original code
would have this happen when setup_arch() directly callse
820_reserve_resources(), but the commit moved those lines into
setup_arch() itself, and they run sooner now than before...
potentially affecting any resources added afterward.
I don't see what effect this reordering could possibly have, since
insert_resource() ignores the IORESOURCE_BUSY flag. But that
commit changed SOMETHING... and the two most obvious changes are
the {request,insert}_resource() switch, and the repositioning of
the request_resource() calls for {code,data,bss}_resource.
I did a LOT of testing of insert_resource() last weekend, and it
completely checked out: it was only called about a dozen times,
and it always inserted the resource without returning errors or
accessing code paths for special cases. That function is not
broken internally, though its proper functioning might have
unintended side effects I have not understood yet.
I had the idea of setting up a side-by-side test -- taking the
two versions of e820_reserve_resources() from before and after
3def3d6d, renaming them, and writing a tiny replacement of
e820_reserve_resources() which would call both versions... with
the idea that I could recurse the resulting trees and compare
their contents for differences.
While reading drivers/char/hpet.c and looking at the functions
used there, I discovered request_region(), and realized that it
would be difficult to compare the entire iomem_resource tree to
a dummy tree only containing resources added by insert_resource()
and request_resource(). It might be simpler to have my tiny
e820_reserve_resources() replacement add each resource to 3 trees
-- the real iomem_resource tree, and 2 dummy trees -- which could
then be compared for differences just before the kernel locks up.
Thanks,
Dave W.
/* head hits pillow */
next reply other threads:[~2008-08-20 4:51 UTC|newest]
Thread overview: 41+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-08-20 4:51 David Witbrodt [this message]
2008-08-20 5:21 ` HPET regression in 2.6.26 versus 2.6.25 -- found another user with the same regression Yinghai Lu
2008-08-20 7:51 ` Bill Fink
2008-08-20 8:02 ` Yinghai Lu
2008-08-20 9:15 ` Ingo Molnar
2008-08-20 9:31 ` Yinghai Lu
2008-08-20 9:36 ` Ingo Molnar
-- strict thread matches above, loose matches on Subject: below --
2008-08-20 14:08 David Witbrodt
2008-08-20 14:32 David Witbrodt
2008-08-20 14:49 ` Ingo Molnar
2008-08-20 16:44 David Witbrodt
2008-08-20 17:42 David Witbrodt
2008-08-20 17:58 ` Yinghai Lu
2008-08-21 2:02 ` Yinghai Lu
2008-08-21 2:48 David Witbrodt
2008-08-21 4:07 David Witbrodt
2008-08-21 6:42 ` Yinghai Lu
2008-08-21 7:04 ` Ilpo Järvinen
2008-08-21 13:33 David Witbrodt
2008-08-21 14:09 David Witbrodt
2008-08-21 15:33 ` Yinghai Lu
2008-08-21 16:53 David Witbrodt
2008-08-21 17:57 ` Yinghai Lu
2008-08-22 1:24 David Witbrodt
2008-08-23 2:25 David Witbrodt
2008-08-23 5:41 ` Yinghai Lu
2008-08-23 6:56 ` Yinghai Lu
2008-08-23 11:42 David Witbrodt
2008-08-23 11:58 David Witbrodt
2008-08-23 13:36 ` Ingo Molnar
2008-08-23 15:03 ` Ingo Molnar
2008-08-23 17:51 ` Yinghai Lu
2008-08-23 15:42 David Witbrodt
2008-08-23 15:55 ` Ingo Molnar
2008-08-23 16:32 David Witbrodt
2008-08-23 16:44 David Witbrodt
2008-08-23 19:29 David Witbrodt
2008-08-23 20:00 David Witbrodt
2008-08-23 20:13 ` Rufus & Azrael
2008-08-23 23:09 David Witbrodt
2008-08-23 23:11 David Witbrodt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=109483.89278.qm@web82104.mail.mud.yahoo.com \
--to=dawitbro@sbcglobal.net \
--cc=hpa@zytor.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=netdev@vger.kernel.org \
--cc=paulmck@linux.vnet.ibm.com \
--cc=peterz@infradead.org \
--cc=tglx@linutronix.de \
--cc=yhlu.kernel@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).