* NPTL/TLS problem in -unstable ?
@ 2005-01-10 18:47 Rik van Riel
2005-01-10 18:56 ` Christian Limpach
0 siblings, 1 reply; 10+ messages in thread
From: Rik van Riel @ 2005-01-10 18:47 UTC (permalink / raw)
To: xen-devel
Hi,
when trying to boot with the latest xen-unstable tree,
I can only get the system working when /lib/tls is
moved out of the way.
With it present, Xen hangs after the TLS warning box,
in the big WARNING. It does its 5 seconds of waiting
and then just hangs...
I can't see any suspect code changes in seg_fixup.c, so
it's probably something else...
--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan
-------------------------------------------------------
The SF.Net email is sponsored by: Beat the post-holiday blues
Get a FREE limited edition SourceForge.net t-shirt from ThinkGeek.
It's fun and FREE -- well, almost....http://www.thinkgeek.com/sfshirt
* Re: NPTL/TLS problem in -unstable ?
2005-01-10 18:47 Rik van Riel
@ 2005-01-10 18:56 ` Christian Limpach
2005-01-10 19:08 ` Rik van Riel
2005-01-10 20:04 ` Rik van Riel
0 siblings, 2 replies; 10+ messages in thread
From: Christian Limpach @ 2005-01-10 18:56 UTC (permalink / raw)
To: Rik van Riel; +Cc: xen-devel
On Mon, Jan 10, 2005 at 01:47:36PM -0500, Rik van Riel wrote:
> when trying to boot with the latest xen-unstable tree,
> I can only get the system working when /lib/tls is
> moved out of the way.
>
> With it present, Xen hangs after the TLS warning box,
> in the big WARNING. It does its 5 seconds of waiting
> and then just hangs...
>
> I can't see any suspect code changes in seg_fixup.c, so
> it's probably something else...
Since you mentioned this in another message yesterday, I
tried booting xen-unstable with /lib/tls in place today
and it worked. Do you have any local changes in your kernel
or a different glibc?
christian
* Re: NPTL/TLS problem in -unstable ?
2005-01-10 18:56 ` Christian Limpach
@ 2005-01-10 19:08 ` Rik van Riel
2005-01-10 20:04 ` Rik van Riel
1 sibling, 0 replies; 10+ messages in thread
From: Rik van Riel @ 2005-01-10 19:08 UTC (permalink / raw)
To: Christian Limpach; +Cc: xen-devel
On Mon, 10 Jan 2005, Christian Limpach wrote:
> Since you mentioned this in another message yesterday, I
> tried booting xen-unstable with /lib/tls in place today
> and it worked. Do you have any local changes in your kernel
> or a different glibc?
Hmmm, I'll try building a stock Xen kernel, not the rawhide one.
* Re: NPTL/TLS problem in -unstable ?
2005-01-10 18:56 ` Christian Limpach
2005-01-10 19:08 ` Rik van Riel
@ 2005-01-10 20:04 ` Rik van Riel
2005-01-10 22:14 ` Rik van Riel
2005-01-11 16:33 ` Christian Limpach
1 sibling, 2 replies; 10+ messages in thread
From: Rik van Riel @ 2005-01-10 20:04 UTC (permalink / raw)
To: Christian Limpach; +Cc: xen-devel
On Mon, 10 Jan 2005, Christian Limpach wrote:
> Since you mentioned this in another message yesterday, I
> tried booting xen-unstable with /lib/tls in place today
> and it worked. Do you have any local changes in your kernel
> or a different glibc?
OK, vanilla linux-2.6.10 as compiled from the xen tree
seems to work, though the first "ldd" after an ldconfig
seems to segfault ;)
# ldconfig
# ldd /usr/bin/gcc
Segmentation fault
# ldd /usr/bin/gcc
linux-gate.so.1 => (0xfbffd000)
libc.so.6 => /lib/tls/libc.so.6 (0x00ae1000)
/lib/ld-linux.so.2 (0x00ac8000)
#
* Re: NPTL/TLS problem in -unstable ?
2005-01-10 20:04 ` Rik van Riel
@ 2005-01-10 22:14 ` Rik van Riel
2005-01-10 23:03 ` Christian Limpach
2005-01-11 16:33 ` Christian Limpach
1 sibling, 1 reply; 10+ messages in thread
From: Rik van Riel @ 2005-01-10 22:14 UTC (permalink / raw)
To: Christian Limpach; +Cc: xen-devel
On Mon, 10 Jan 2005, Rik van Riel wrote:
> On Mon, 10 Jan 2005, Christian Limpach wrote:
>
>> Since you mentioned this in another message yesterday, I
>> tried booting xen-unstable with /lib/tls in place today
>> and it worked. Do you have any local changes in your kernel
>> or a different glibc?
>
> OK, vanilla linux-2.6.10 as compiled from the xen tree
> seems to work, though the first "ldd" after an ldconfig
> seems to segfault ;)
I was wrong. Moving /lib/tls back in place after the system
has started up works, but booting with /lib/tls already there
hangs the system hard...
* Re: NPTL/TLS problem in -unstable ?
2005-01-10 22:14 ` Rik van Riel
@ 2005-01-10 23:03 ` Christian Limpach
2005-01-11 16:03 ` Rik van Riel
0 siblings, 1 reply; 10+ messages in thread
From: Christian Limpach @ 2005-01-10 23:03 UTC (permalink / raw)
To: Rik van Riel; +Cc: xen-devel
On Mon, Jan 10, 2005 at 05:14:28PM -0500, Rik van Riel wrote:
> On Mon, 10 Jan 2005, Rik van Riel wrote:
> >On Mon, 10 Jan 2005, Christian Limpach wrote:
> >
> >>Since you mentioned this in another message yesterday, I
> >>tried booting xen-unstable with /lib/tls in place today
> >>and it worked. Do you have any local changes in your kernel
> >>or a different glibc?
> >
> >OK, vanilla linux-2.6.10 as compiled from the xen tree
> >seems to work, though the first "ldd" after an ldconfig
> >seems to segfault ;)
>
> I was wrong. Moving /lib/tls back in place after the system
> has started up works, but booting with /lib/tls already there
> hangs the system hard...
I've given it another try (since my previous test was with a non-default
kernel config file) but it still works for me (boot with /lib/tls in place).
When it hangs, is Xen still alive, i.e. does pressing ctrl-a 3 times
switch to Xen's console? Can you give some more information, like what
CPU you're using and can you find out what the last version was which
worked? Does 2.0-testing work?
christian
* Re: NPTL/TLS problem in -unstable ?
[not found] <mailman.1105390411.29331@unix-os.sc.intel.com>
@ 2005-01-10 23:47 ` Arun Sharma
0 siblings, 0 replies; 10+ messages in thread
From: Arun Sharma @ 2005-01-10 23:47 UTC (permalink / raw)
To: Rik van Riel; +Cc: xen-devel
On 1/10/2005 10:47 AM, Rik van Riel wrote:
> Hi,
>
> when trying to boot with the latest xen-unstable tree,
> I can only get the system working when /lib/tls is
> moved out of the way.
>
> With it present, Xen hangs after the TLS warning box,
> in the big WARNING. It does its 5 seconds of waiting
> and then just hangs...
>
I also saw this behavior on a Fedora Core 2 box with xen-unstable (2.6.10-xen0).
-Arun
* Re: NPTL/TLS problem in -unstable ?
2005-01-10 23:03 ` Christian Limpach
@ 2005-01-11 16:03 ` Rik van Riel
0 siblings, 0 replies; 10+ messages in thread
From: Rik van Riel @ 2005-01-11 16:03 UTC (permalink / raw)
To: Christian Limpach; +Cc: xen-devel
On Mon, 10 Jan 2005, Christian Limpach wrote:
> I've given it another try (since my previous test was with a non-default
> kernel config file) but it still works for me (boot with /lib/tls in place).
> When it hangs, is Xen still alive, i.e. does pressing ctrl-a 3 times
> switch to Xen's console? Can you give some more information, like what
> CPU you're using and can you find out what the last version was which
> worked? Does 2.0-testing work?
I cannot get to the Xen console, but alt-sysrq-p does
work.
It looks like most of the time the call trace is:
force_evtchn_callback+0xa/0xc
force_sig_info+0x19a/0x1a0
do_page_fault+0x4f6/0x597
[varying stack leftovers]
evtchn_do_upcall+0x7e/0xc0
page_fault+0x3b/0x40
Hmmm, looks like a segmentation fault. Possibly the same
segfault that was hitting the first program run after
switching on NPTL by moving /lib/tls back on an already
running system.
After pressing alt+sysrq+p a few hundred times, I got lucky:
(typed in by hand)
Pid: 1, comm: init
EIP: 0061:[<c0109f00>] CPU: 0
EIP is at page_fault+0x0/0x40
EFLAGS: 00001286 Not tainted (2.6.10-1.175_FC4xen0)
EAX: c02f8558 EBX: c02f8989 ECX: ffffffe0 EDX: 00000002
ESI: 00000000 EDI: 00000006 EBP: bffffe84 DS: 007b ES: 007b
Now, from arch/xen/i386/kernel/entry.S:
# This handler is special, because it gets an extra value on its stack,
# which is the linear faulting address.
# fastcall register usage: %eax = pt_regs, %edx = error code,
# %ecx = fault address
ENTRY(page_fault)
pushl %ds
...
If so, are we faulting on ffffffe0 ? I know the vsyscall
page shouldn't be compatible with Xen (or execshield), so
why is init trying to fault there ?
Why isn't do_page_fault bailing out and trying to kill the
process ?
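
[Editorial sketch, not part of the original message.] The entry.S comment
quoted above says %edx carries the page-fault error code when the handler is
called. Whether EDX in this dump really holds that value is an assumption:
the dump was taken at page_fault+0x0, before the entry code has loaded any
registers. If one does take EDX = 2 as the error code, the standard x86
error-code bits can be decoded as below. The names PF_PROT, PF_WRITE,
PF_USER, decode_pf_error and describe_pf_error are hypothetical, chosen for
this illustration only.

```c
#include <stdio.h>

/* Architectural x86 page-fault error code bits (low three bits only). */
#define PF_PROT  0x1UL  /* 0: page not present, 1: protection violation */
#define PF_WRITE 0x2UL  /* 0: read access,      1: write access         */
#define PF_USER  0x4UL  /* 0: kernel mode,      1: user mode            */

/* Decoded view of an error code; each field is 0 or 1. */
struct pf_info {
    int protection;
    int write;
    int user;
};

/* Split an error code into its three low flag bits. */
struct pf_info decode_pf_error(unsigned long error_code)
{
    struct pf_info info;

    info.protection = (error_code & PF_PROT) != 0;
    info.write = (error_code & PF_WRITE) != 0;
    info.user = (error_code & PF_USER) != 0;
    return info;
}

/* Write a one-line description into buf; returns snprintf's result. */
int describe_pf_error(unsigned long error_code, char *buf, size_t len)
{
    struct pf_info info = decode_pf_error(error_code);

    return snprintf(buf, len, "%s %s page in %s mode",
                    info.write ? "write to a" : "read from a",
                    info.protection ? "protected" : "not-present",
                    info.user ? "user" : "kernel");
}
```

Under that assumption, an error code of 2 would describe a write to a
not-present page from kernel mode. This says nothing about whether ECX is
the faulting address here.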
* Re: NPTL/TLS problem in -unstable ?
2005-01-10 20:04 ` Rik van Riel
2005-01-10 22:14 ` Rik van Riel
@ 2005-01-11 16:33 ` Christian Limpach
2005-01-11 19:21 ` Rik van Riel
1 sibling, 1 reply; 10+ messages in thread
From: Christian Limpach @ 2005-01-11 16:33 UTC (permalink / raw)
To: Rik van Riel; +Cc: xen-devel
On Mon, Jan 10, 2005 at 03:04:50PM -0500, Rik van Riel wrote:
> >Since you mentioned this in another message yesterday, I
> >tried booting xen-unstable with /lib/tls in place today
> >and it worked. Do you have any local changes in your kernel
> >or a different glibc?
>
> OK, vanilla linux-2.6.10 as compiled from the xen tree
> seems to work, though the first "ldd" after an ldconfig
> seems to segfault ;)
I managed to reproduce this earlier this afternoon (FC3 userland
seems to trigger it reliably, while debian userland doesn't) and
it is now fixed in xen-2.0, xen-2.0-testing and xen-unstable.
The problem was introduced during the upgrade to 2.6.10 where
calls from entry.S to C functions got changed to be fastcall
instead of asmlinkage. I missed changing one of the functions
(the seg fixup callback one) and when the compiler turned the
final printk in that function into a tail call, it would clobber
the first register on the stack which then caused the instruction
to be re-executed with a random %ebx.
christian
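
[Editorial sketch, not part of the original message.] The failure mode
described above can be illustrated with a toy model in portable C. It does
not use the real asmlinkage or fastcall attributes, and it is only one
plausible reading of the explanation: a callee that takes its arguments on
the stack owns those stack slots and may reuse them for an outgoing tail
call. If the slots are really the saved register frame pushed by the entry
code, slot 0 (the saved %ebx) gets overwritten. All names here
(frame_model, TAIL_CALL_ARG_MODEL, the *_model functions) are hypothetical,
and the frame is cut down to three fields.

```c
#include <assert.h>

/* Simplified stand-in for the saved-register frame pushed by the entry
 * code. The real i386 frame has more fields; ebx is the first slot. */
struct frame_model {
    unsigned long ebx;
    unsigned long ecx;
    unsigned long edx;
};

/* Placeholder for whatever the tail call stores as its first argument. */
#define TAIL_CALL_ARG_MODEL 0xdeadbeefUL

/* Stack-argument style: the callee treats the slots at `args` as its own
 * incoming arguments, so setting up a tail call may overwrite slot 0. */
static void callee_stack_args_model(unsigned long *args)
{
    args[0] = TAIL_CALL_ARG_MODEL;
}

/* Register-argument style: the callee receives a pointer to the frame
 * and only reads through it, so the saved registers stay intact. */
static unsigned long callee_reg_arg_model(const struct frame_model *regs)
{
    return regs->ebx;
}

/* Value of ebx the interrupted code would resume with in each case. */
unsigned long resume_ebx_stack_args(unsigned long ebx)
{
    struct frame_model frame = { ebx, 0, 0 };

    callee_stack_args_model(&frame.ebx);
    return frame.ebx;
}

unsigned long resume_ebx_reg_arg(unsigned long ebx)
{
    struct frame_model frame = { ebx, 0, 0 };

    (void)callee_reg_arg_model(&frame);
    return frame.ebx;
}

/* Usage: both assertions hold when the model behaves as described. */
void model_self_check(void)
{
    assert(resume_ebx_reg_arg(0x1234UL) == 0x1234UL);
    assert(resume_ebx_stack_args(0x1234UL) != 0x1234UL);
}
```

In the model, the stack-argument callee changes the ebx that would be
restored, which corresponds to the instruction being re-executed with a
wrong %ebx. The register-argument callee leaves it alone.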
* Re: NPTL/TLS problem in -unstable ?
2005-01-11 16:33 ` Christian Limpach
@ 2005-01-11 19:21 ` Rik van Riel
0 siblings, 0 replies; 10+ messages in thread
From: Rik van Riel @ 2005-01-11 19:21 UTC (permalink / raw)
To: Christian Limpach; +Cc: xen-devel
On Tue, 11 Jan 2005, Christian Limpach wrote:
> I managed to reproduce this earlier this afternoon (FC3 userland
> seems to trigger it reliably, while debian userland doesn't) and
> it is now fixed in xen-2.0, xen-2.0-testing and xen-unstable.
Confirmed, things are working again now.
I promise I'll try to follow the xen tree more closely,
so I can find breakages such as this on the day they
happen - that should help make it easier to track them
down.