* [PATCH] [RFC] fix kernel crash (protection id trap) when compiling ruby1.9
@ 2008-12-17 22:46 Helge Deller
2008-12-18 0:05 ` [PATCH] [RFC] fix kernel crash (protection id trap) when compiling John David Anglin
` (3 more replies)
0 siblings, 4 replies; 56+ messages in thread
From: Helge Deller @ 2008-12-17 22:46 UTC (permalink / raw)
To: linux-parisc
[-- Attachment #1: Type: text/plain, Size: 2371 bytes --]
The Debian bugzilla has a long thread about kernel crashes when
compiling ruby1.9 on hppa. This kernel bug led even to discussions if
hppa should be dropped for lenny.
See http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=478717 for details.
It's really easy to reproduce the bug, and it generates this backtrace
(interestingly two backtraces):
< Your System ate a SPARC! Gah! >
-------------------------------
\ ^__^
\ (xx)\_______
(__)\ )\/\
U ||----w |
|| ||
miniruby (pid 15221): Protection id trap (code 27)
YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00000000000001000000000000001111 Not tainted
r00-03 0004000f 102a9800 101a141c 8210c388
r04-07 00000000 0020fd08 0020fd10 00000001
r08-11 00000000 8210c388 fffffff2 8210c0c8
r12-15 fb0d04c8 402cc3d8 00001000 40007000
r16-19 002120a0 00000010 0020fd90 00000001
r20-23 8210c000 00000000 0020fd0e 8210c39e
r24-27 00000000 00000001 8e7c5660 105e7a90
r28-31 0000000f 00190834 8210c500 101a12b8
sr00-03 00000000 00000000 00000000 00000847
sr04-07 00000000 00000000 00000000 00000000
IASQ: 00000000 00000000 IAOQ: 101a147c 101a1480
IIR: 0ed5d240 ISR: 00000847 IOR: 0020fd0e
CPU: 0 CR30: 8210c000 CR31: d22344f0
ORIG_R28: 00001000
IAOQ[0]: do_sys_poll+0x1ac/0x1b8
IAOQ[1]: do_sys_poll+0x1b0/0x1b8
RP(r2): do_sys_poll+0x14c/0x1b8
Backtrace:
[<101a1574>] sys_poll+0x84/0xec
[<10114078>] syscall_exit+0x0/0x28
Backtrace:
[<1010fdb8>] die_if_kernel+0xe8/0x1ac
[<10110584>] handle_interruption+0x2fc/0x598
[<10113078>] intr_check_sig+0x0/0x34
The bug (sometimes but not always!) happens in fs/select.c:do_sys_poll()
when calling __put_user() and writing back fds[0].revents to userspace.
What I quite don't understand yet is, why does copy_from_user() [called
a few lines above the __put_user()] succeeds, and __put_user() sometimes
suddenly fails with a protection id fault.
The attached patch simply adds the lookup for a fixup handler when trap
#27 (protection id trap) happens in kernel space. This was missing in
the code path for trap #27 which is why the kernel then called
die_if_kernel() and crashed.
Even with this patch ruby1.9 may fail to compile, but at least the
kernel crashes are gone.
Any feedback welcome.
Helge
Signed-off-by: Helge Deller <deller@gmx.de>
[-- Attachment #2: data_protection_id_failure.diff --]
[-- Type: text/x-patch, Size: 1693 bytes --]
diff --git a/arch/parisc/kernel/traps.c b/arch/parisc/kernel/traps.c
index 4c771cd..70eabfe 100644
--- a/arch/parisc/kernel/traps.c
+++ b/arch/parisc/kernel/traps.c
@@ -43,6 +43,8 @@
#include "../math-emu/math-emu.h" /* for handle_fpe() */
+DECLARE_PER_CPU(struct exception_data, exception_data);
+
#define PRINT_USER_FAULTS /* (turn this on if you want user faults to be */
/* dumped to the console via printk) */
@@ -745,6 +747,41 @@ void handle_interruption(int code, struct pt_regs *regs)
/* Fall Through */
case 27:
/* Data memory protection ID trap */
+ if (code == 27 && !user_mode(regs)) {
+ const struct exception_table_entry *fix;
+
+ /* mostly copied from:
+ arch/parisc/mm/fault.c:do_page_fault()
+ */
+ fix = search_exception_tables(regs->iaoq[0]);
+ printk(KERN_CRIT "BUG: Kernel Data memory protection ID"
+ " trap at %p (%pS), fix=%p\n",
+ (void*)regs->iaoq[0], (void*)regs->iaoq[0], fix);
+ if (fix) {
+ struct exception_data *d;
+
+ d = &__get_cpu_var(exception_data);
+ d->fault_ip = regs->iaoq[0];
+ d->fault_space = regs->isr;
+ d->fault_addr = regs->ior;
+
+ regs->iaoq[0] = ((fix->fixup) & ~3);
+
+ /*
+ * NOTE: In some cases the faulting instruction
+ * may be in the delay slot of a branch. We
+ * don't want to take the branch, so we don't
+ * increment iaoq[1], instead we set it to be
+ * iaoq[0]+4, and clear the B bit in the PSW
+ */
+
+ regs->iaoq[1] = regs->iaoq[0] + 4;
+ regs->gr[0] &= ~PSW_B; /* IPSW in gr[0] */
+
+ return;
+ }
+ }
+
die_if_kernel("Protection id trap", regs, code);
si.si_code = SEGV_MAPERR;
si.si_signo = SIGSEGV;
^ permalink raw reply related [flat|nested] 56+ messages in thread* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when compiling 2008-12-17 22:46 [PATCH] [RFC] fix kernel crash (protection id trap) when compiling ruby1.9 Helge Deller @ 2008-12-18 0:05 ` John David Anglin 2008-12-18 0:43 ` Kyle McMartin 2008-12-18 1:46 ` [PATCH] [RFC] fix kernel crash (protection id trap) when compiling ruby1.9 Carlos O'Donell ` (2 subsequent siblings) 3 siblings, 1 reply; 56+ messages in thread From: John David Anglin @ 2008-12-18 0:05 UTC (permalink / raw) To: Helge Deller; +Cc: linux-parisc > r20-23 8210c000 00000000 0020fd0e 8210c39e > r24-27 00000000 00000001 8e7c5660 105e7a90 > r28-31 0000000f 00190834 8210c500 101a12b8 > sr00-03 00000000 00000000 00000000 00000847 > sr04-07 00000000 00000000 00000000 00000000 > > IASQ: 00000000 00000000 IAOQ: 101a147c 101a1480 > IIR: 0ed5d240 ISR: 00000847 IOR: 0020fd0e > CPU: 0 CR30: 8210c000 CR31: d22344f0 > ORIG_R28: 00001000 > IAOQ[0]: do_sys_poll+0x1ac/0x1b8 > IAOQ[1]: do_sys_poll+0x1b0/0x1b8 > RP(r2): do_sys_poll+0x14c/0x1b8 The faulting instruction is: dave@hiauly6:~/opt/gnu/bin$ disasm 0x0ed5d240 0: 0e d5 d2 40 sth r21,0(sr3,r22) r22 is 0. r22 being 0 is a bit wierd as revents has on offset of 6. Dave -- J. David Anglin dave.anglin@nrc-cnrc.gc.ca National Research Council of Canada (613) 990-0752 (FAX: 952-6602) ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when compiling 2008-12-18 0:05 ` [PATCH] [RFC] fix kernel crash (protection id trap) when compiling John David Anglin @ 2008-12-18 0:43 ` Kyle McMartin 2008-12-18 3:02 ` [PATCH] [RFC] fix kernel crash (protection id trap) when John David Anglin 0 siblings, 1 reply; 56+ messages in thread From: Kyle McMartin @ 2008-12-18 0:43 UTC (permalink / raw) To: John David Anglin; +Cc: Helge Deller, linux-parisc On Wed, Dec 17, 2008 at 07:05:26PM -0500, John David Anglin wrote: > > r20-23 8210c000 00000000 0020fd0e 8210c39e > > r24-27 00000000 00000001 8e7c5660 105e7a90 > > r28-31 0000000f 00190834 8210c500 101a12b8 > > sr00-03 00000000 00000000 00000000 00000847 > > sr04-07 00000000 00000000 00000000 00000000 > > > > IASQ: 00000000 00000000 IAOQ: 101a147c 101a1480 > > IIR: 0ed5d240 ISR: 00000847 IOR: 0020fd0e > > CPU: 0 CR30: 8210c000 CR31: d22344f0 > > ORIG_R28: 00001000 > > IAOQ[0]: do_sys_poll+0x1ac/0x1b8 > > IAOQ[1]: do_sys_poll+0x1b0/0x1b8 > > RP(r2): do_sys_poll+0x14c/0x1b8 > > The faulting instruction is: > > dave@hiauly6:~/opt/gnu/bin$ disasm 0x0ed5d240 > 0: 0e d5 d2 40 sth r21,0(sr3,r22) > > r22 is 0. r22 being 0 is a bit wierd as revents has on offset of 6. > Eh? r22 is 0020fd0e... although it's easier to spot by looking at IOR > > IIR: 0ed5d240 ISR: 00000847 IOR: 0020fd0e Weird. Although, I don't think we're supposed to ever be taking a P. Id trap on a userspace process as the kernel... ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-18 0:43 ` Kyle McMartin @ 2008-12-18 3:02 ` John David Anglin 2008-12-18 3:05 ` Kyle McMartin 2008-12-18 4:04 ` Kyle McMartin 0 siblings, 2 replies; 56+ messages in thread From: John David Anglin @ 2008-12-18 3:02 UTC (permalink / raw) To: Kyle McMartin; +Cc: deller, linux-parisc > Eh? r22 is 0020fd0e... although it's easier to spot by looking at IOR > > > IIR: 0ed5d240 ISR: 00000847 IOR: 0020fd0e Obviously, I can't count... > Although, I don't think we're supposed to ever be taking a P. Id trap on > a userspace process as the kernel... Ok, should P and/or D be on? Assuming the protection id isn't wrong, then the WD bit must be on. A problem with the WD bit might explain Helge's comment. As a start, dumping CR 8, 9, 12 and 13 might help debug the problem. There might be a problem restoring access IDs after a cache purge or TLB insertion. Dave -- J. David Anglin dave.anglin@nrc-cnrc.gc.ca National Research Council of Canada (613) 990-0752 (FAX: 952-6602) ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-18 3:02 ` [PATCH] [RFC] fix kernel crash (protection id trap) when John David Anglin @ 2008-12-18 3:05 ` Kyle McMartin 2008-12-18 4:04 ` Kyle McMartin 1 sibling, 0 replies; 56+ messages in thread From: Kyle McMartin @ 2008-12-18 3:05 UTC (permalink / raw) To: John David Anglin; +Cc: deller, linux-parisc On Wed, Dec 17, 2008 at 10:02:13PM -0500, John David Anglin wrote: > > Although, I don't think we're supposed to ever be taking a P. Id trap on > > a userspace process as the kernel... > > Ok, should P and/or D be on? > > Assuming the protection id isn't wrong, then the WD bit must be on. > A problem with the WD bit might explain Helge's comment. > > As a start, dumping CR 8, 9, 12 and 13 might help debug the problem. > > There might be a problem restoring access IDs after a cache purge or > TLB insertion. > I'm starting a build of ruby now, with some debugging gunk added to dump the pid %crs, we'll see how it goes in a few hours... re, Kyle ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-18 3:02 ` [PATCH] [RFC] fix kernel crash (protection id trap) when John David Anglin 2008-12-18 3:05 ` Kyle McMartin @ 2008-12-18 4:04 ` Kyle McMartin 2008-12-18 16:16 ` Carlos O'Donell 2008-12-18 18:28 ` Kyle McMartin 1 sibling, 2 replies; 56+ messages in thread From: Kyle McMartin @ 2008-12-18 4:04 UTC (permalink / raw) To: John David Anglin; +Cc: Kyle McMartin, deller, linux-parisc On Wed, Dec 17, 2008 at 10:02:13PM -0500, John David Anglin wrote: > > Eh? r22 is 0020fd0e... although it's easier to spot by looking at IOR > > > > IIR: 0ed5d240 ISR: 00000847 IOR: 0020fd0e > > Obviously, I can't count... > > > Although, I don't think we're supposed to ever be taking a P. Id trap on > > a userspace process as the kernel... > > Ok, should P and/or D be on? > Huh... Possibly the P-bit shouldn't be on for kernel mode... the kernel after all can scribble on whomevers pages it so chooses... > Assuming the protection id isn't wrong, then the WD bit must be on. > A problem with the WD bit might explain Helge's comment. > > As a start, dumping CR 8, 9, 12 and 13 might help debug the problem. > Building a patch to do this now, will attach it here if it builds. > There might be a problem restoring access IDs after a cache purge or > TLB insertion. > I honestly have no idea how the protection id registers are loaded... I thought it had something do with with space registers, but I can't remember off the top of my head. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-18 4:04 ` Kyle McMartin @ 2008-12-18 16:16 ` Carlos O'Donell 2008-12-18 18:28 ` Kyle McMartin 1 sibling, 0 replies; 56+ messages in thread From: Carlos O'Donell @ 2008-12-18 16:16 UTC (permalink / raw) To: Kyle McMartin; +Cc: John David Anglin, deller, linux-parisc On Wed, Dec 17, 2008 at 11:04 PM, Kyle McMartin <kyle@infradead.org> wrote: > On Wed, Dec 17, 2008 at 10:02:13PM -0500, John David Anglin wrote: >> > Eh? r22 is 0020fd0e... although it's easier to spot by looking at IOR >> > > > IIR: 0ed5d240 ISR: 00000847 IOR: 0020fd0e >> >> Obviously, I can't count... >> >> > Although, I don't think we're supposed to ever be taking a P. Id trap on >> > a userspace process as the kernel... >> >> Ok, should P and/or D be on? >> > > Huh... Possibly the P-bit shouldn't be on for kernel mode... the kernel > after all can scribble on whomevers pages it so chooses... IMO the kernel should run with the P-Bit on. Disabling the P-Bit could potentially mask stale TLB entry bugs. Imagine this scenario: 1. Context switch to process A, setup protection ids, setup TLB entries. 2. Context switch to process B, setup protection ids, incorrectly setup TLB entries (entries from process A remain). 3. Kernel tries to access process B's page but triggers TLB entry for process A and gets a data protection id trap. The trap in #3 is correct since we used a stale TLB entry. What I don't know is this: Are all of process A's tlb entries purged at context switch? Do they have to be? Do we implement lazy tlb flushing? Do process A's tlb entries only get flushed when they trigger a data protection id trap? >> There might be a problem restoring access IDs after a cache purge or >> TLB insertion. >> > > I honestly have no idea how the protection id registers are loaded... I > thought it had something do with with space registers, but I can't > remember off the top of my head. I would imagine protection id registers are loaded at every context switch? Do threads have the same protection id register values (and the same space register) such that they can access the same pages? c. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-18 4:04 ` Kyle McMartin 2008-12-18 16:16 ` Carlos O'Donell @ 2008-12-18 18:28 ` Kyle McMartin 2008-12-19 15:25 ` Carlos O'Donell 1 sibling, 1 reply; 56+ messages in thread From: Kyle McMartin @ 2008-12-18 18:28 UTC (permalink / raw) To: Kyle McMartin; +Cc: John David Anglin, deller, linux-parisc On Wed, Dec 17, 2008 at 11:04:00PM -0500, Kyle McMartin wrote: > > There might be a problem restoring access IDs after a cache purge or > > TLB insertion. > > > > I honestly have no idea how the protection id registers are loaded... I > thought it had something do with with space registers, but I can't > remember off the top of my head. For future reference, in mmu_context.h. Bloody tricky mtctl() macro doesn't take a prefix, just the numeric cr #. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-18 18:28 ` Kyle McMartin @ 2008-12-19 15:25 ` Carlos O'Donell 2008-12-19 16:13 ` John David Anglin 0 siblings, 1 reply; 56+ messages in thread From: Carlos O'Donell @ 2008-12-19 15:25 UTC (permalink / raw) To: Kyle McMartin; +Cc: John David Anglin, deller, linux-parisc On Thu, Dec 18, 2008 at 1:28 PM, Kyle McMartin <kyle@infradead.org> wrote: > On Wed, Dec 17, 2008 at 11:04:00PM -0500, Kyle McMartin wrote: >> > There might be a problem restoring access IDs after a cache purge or >> > TLB insertion. >> > >> >> I honestly have no idea how the protection id registers are loaded... I >> thought it had something do with with space registers, but I can't >> remember off the top of my head. > > For future reference, in mmu_context.h. Bloody tricky mtctl() macro > doesn't take a prefix, just the numeric cr #. Evil. Helge, Can you cleanup your patch and resubmit. I think your patch is a good belt-and-suspenders patch. The kernel should not fault on a protection id trap taken while writing to a userspace address (even if such a scenario indicates a possible bug in the mm context switching code). Cheers, Carlos. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-19 15:25 ` Carlos O'Donell @ 2008-12-19 16:13 ` John David Anglin 2008-12-19 16:23 ` Carlos O'Donell 0 siblings, 1 reply; 56+ messages in thread From: John David Anglin @ 2008-12-19 16:13 UTC (permalink / raw) To: Carlos O'Donell; +Cc: kyle, deller, linux-parisc > > For future reference, in mmu_context.h. Bloody tricky mtctl() macro > > doesn't take a prefix, just the numeric cr #. > > Evil. If this is a problem with the WD bit, the problem is likely here: mtctl(context >> (SPACEID_SHIFT - 1),8); It's possible this might set the WD bit depending on the alignment of context and the value of SPACEID_SHIFT on your machine. If you can duplicate, maybe add a BUG_ON. On hppa64, all structs are passed by value, so the alignment of context may only be BIGGEST_ALIGNMENT. Dave -- J. David Anglin dave.anglin@nrc-cnrc.gc.ca National Research Council of Canada (613) 990-0752 (FAX: 952-6602) ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-19 16:13 ` John David Anglin @ 2008-12-19 16:23 ` Carlos O'Donell 2008-12-19 16:28 ` Kyle McMartin 0 siblings, 1 reply; 56+ messages in thread From: Carlos O'Donell @ 2008-12-19 16:23 UTC (permalink / raw) To: John David Anglin; +Cc: kyle, deller, linux-parisc On Fri, Dec 19, 2008 at 11:13 AM, John David Anglin <dave@hiauly1.hia.nrc.ca> wrote: >> > For future reference, in mmu_context.h. Bloody tricky mtctl() macro >> > doesn't take a prefix, just the numeric cr #. >> >> Evil. > > If this is a problem with the WD bit, the problem is likely here: > > mtctl(context >> (SPACEID_SHIFT - 1),8); > > It's possible this might set the WD bit depending on the alignment > of context and the value of SPACEID_SHIFT on your machine. If you > can duplicate, maybe add a BUG_ON. On hppa64, all structs are passed > by value, so the alignment of context may only be BIGGEST_ALIGNMENT. The value of context is a space id (the value returned by alloc_sid()) and is not related to any alignment? Cheers, Carlos. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-19 16:23 ` Carlos O'Donell @ 2008-12-19 16:28 ` Kyle McMartin 2008-12-19 16:35 ` Carlos O'Donell 0 siblings, 1 reply; 56+ messages in thread From: Kyle McMartin @ 2008-12-19 16:28 UTC (permalink / raw) To: Carlos O'Donell; +Cc: John David Anglin, kyle, deller, linux-parisc On Fri, Dec 19, 2008 at 11:23:39AM -0500, Carlos O'Donell wrote: > On Fri, Dec 19, 2008 at 11:13 AM, John David Anglin > <dave@hiauly1.hia.nrc.ca> wrote: > >> > For future reference, in mmu_context.h. Bloody tricky mtctl() macro > >> > doesn't take a prefix, just the numeric cr #. > >> > >> Evil. > > > > If this is a problem with the WD bit, the problem is likely here: > > > > mtctl(context >> (SPACEID_SHIFT - 1),8); > > > > It's possible this might set the WD bit depending on the alignment > > of context and the value of SPACEID_SHIFT on your machine. If you > > can duplicate, maybe add a BUG_ON. On hppa64, all structs are passed > > by value, so the alignment of context may only be BIGGEST_ALIGNMENT. > > The value of context is a space id (the value returned by alloc_sid()) > and is not related to any alignment? It's a typedef to an unsigned long. I'm not sure we've seen this on a 64-bit kernel (which is the SPACEID_SHIFT one.) On a 32-bit kernel (where I've been able to reproduce this) we do a leftshift by one, so the WD bit is guaranteed to be 0. cheers, Kyle ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-19 16:28 ` Kyle McMartin @ 2008-12-19 16:35 ` Carlos O'Donell 2008-12-19 16:36 ` Carlos O'Donell 0 siblings, 1 reply; 56+ messages in thread From: Carlos O'Donell @ 2008-12-19 16:35 UTC (permalink / raw) To: Kyle McMartin; +Cc: John David Anglin, deller, linux-parisc On Fri, Dec 19, 2008 at 11:28 AM, Kyle McMartin <kyle@infradead.org> wrote: > On Fri, Dec 19, 2008 at 11:23:39AM -0500, Carlos O'Donell wrote: >> On Fri, Dec 19, 2008 at 11:13 AM, John David Anglin >> <dave@hiauly1.hia.nrc.ca> wrote: >> >> > For future reference, in mmu_context.h. Bloody tricky mtctl() macro >> >> > doesn't take a prefix, just the numeric cr #. >> >> >> >> Evil. >> > >> > If this is a problem with the WD bit, the problem is likely here: >> > >> > mtctl(context >> (SPACEID_SHIFT - 1),8); >> > >> > It's possible this might set the WD bit depending on the alignment >> > of context and the value of SPACEID_SHIFT on your machine. If you >> > can duplicate, maybe add a BUG_ON. On hppa64, all structs are passed >> > by value, so the alignment of context may only be BIGGEST_ALIGNMENT. >> >> The value of context is a space id (the value returned by alloc_sid()) >> and is not related to any alignment? > > It's a typedef to an unsigned long. I'm not sure we've seen this on a > 64-bit kernel (which is the SPACEID_SHIFT one.) > > On a 32-bit kernel (where I've been able to reproduce this) we do a > leftshift by one, so the WD bit is guaranteed to be 0. John has a good point though, we might as well assert that? c. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-19 16:35 ` Carlos O'Donell @ 2008-12-19 16:36 ` Carlos O'Donell 2008-12-19 16:44 ` Kyle McMartin 0 siblings, 1 reply; 56+ messages in thread From: Carlos O'Donell @ 2008-12-19 16:36 UTC (permalink / raw) To: Kyle McMartin; +Cc: John David Anglin, deller, linux-parisc On Fri, Dec 19, 2008 at 11:35 AM, Carlos O'Donell <carlos@systemhalted.org> wrote: > On Fri, Dec 19, 2008 at 11:28 AM, Kyle McMartin <kyle@infradead.org> wrote: >> On Fri, Dec 19, 2008 at 11:23:39AM -0500, Carlos O'Donell wrote: >>> On Fri, Dec 19, 2008 at 11:13 AM, John David Anglin >>> <dave@hiauly1.hia.nrc.ca> wrote: >>> >> > For future reference, in mmu_context.h. Bloody tricky mtctl() macro >>> >> > doesn't take a prefix, just the numeric cr #. >>> >> >>> >> Evil. >>> > >>> > If this is a problem with the WD bit, the problem is likely here: >>> > >>> > mtctl(context >> (SPACEID_SHIFT - 1),8); >>> > >>> > It's possible this might set the WD bit depending on the alignment >>> > of context and the value of SPACEID_SHIFT on your machine. If you >>> > can duplicate, maybe add a BUG_ON. On hppa64, all structs are passed >>> > by value, so the alignment of context may only be BIGGEST_ALIGNMENT. >>> >>> The value of context is a space id (the value returned by alloc_sid()) >>> and is not related to any alignment? >> >> It's a typedef to an unsigned long. I'm not sure we've seen this on a >> 64-bit kernel (which is the SPACEID_SHIFT one.) >> >> On a 32-bit kernel (where I've been able to reproduce this) we do a >> leftshift by one, so the WD bit is guaranteed to be 0. > > John has a good point though, we might as well assert that? In alloc_sid() when we allocate the space id? c. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-19 16:36 ` Carlos O'Donell @ 2008-12-19 16:44 ` Kyle McMartin 2008-12-19 17:28 ` Carlos O'Donell 0 siblings, 1 reply; 56+ messages in thread From: Kyle McMartin @ 2008-12-19 16:44 UTC (permalink / raw) To: Carlos O'Donell Cc: Kyle McMartin, John David Anglin, deller, linux-parisc On Fri, Dec 19, 2008 at 11:36:33AM -0500, Carlos O'Donell wrote: > On Fri, Dec 19, 2008 at 11:35 AM, Carlos O'Donell > <carlos@systemhalted.org> wrote: > > On Fri, Dec 19, 2008 at 11:28 AM, Kyle McMartin <kyle@infradead.org> wrote: > >> On Fri, Dec 19, 2008 at 11:23:39AM -0500, Carlos O'Donell wrote: > >>> On Fri, Dec 19, 2008 at 11:13 AM, John David Anglin > >>> <dave@hiauly1.hia.nrc.ca> wrote: > >>> >> > For future reference, in mmu_context.h. Bloody tricky mtctl() macro > >>> >> > doesn't take a prefix, just the numeric cr #. > >>> >> > >>> >> Evil. > >>> > > >>> > If this is a problem with the WD bit, the problem is likely here: > >>> > > >>> > mtctl(context >> (SPACEID_SHIFT - 1),8); > >>> > > >>> > It's possible this might set the WD bit depending on the alignment > >>> > of context and the value of SPACEID_SHIFT on your machine. If you > >>> > can duplicate, maybe add a BUG_ON. On hppa64, all structs are passed > >>> > by value, so the alignment of context may only be BIGGEST_ALIGNMENT. > >>> > >>> The value of context is a space id (the value returned by alloc_sid()) > >>> and is not related to any alignment? > >> > >> It's a typedef to an unsigned long. I'm not sure we've seen this on a > >> 64-bit kernel (which is the SPACEID_SHIFT one.) > >> > >> On a 32-bit kernel (where I've been able to reproduce this) we do a > >> leftshift by one, so the WD bit is guaranteed to be 0. > > > > John has a good point though, we might as well assert that? > > In alloc_sid() when we allocate the space id? > Meh. > >>> > mtctl(context >> (SPACEID_SHIFT - 1),8); SPACEID_SHIFT is the number of spaceid bits, which is 32 on 64-bit. I suspect what this is trying to do, is put it in the rightmost half of the register, so as long as context is shifted left already, (which it is.) When you cancel out the SPACEID_SHIFT in alloc_sid, and here in load_context, you end up with the same mtctl(context << 1, 8) as on 32-bit. re, Kyle ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-19 16:44 ` Kyle McMartin @ 2008-12-19 17:28 ` Carlos O'Donell 2008-12-19 17:33 ` Kyle McMartin 0 siblings, 1 reply; 56+ messages in thread From: Carlos O'Donell @ 2008-12-19 17:28 UTC (permalink / raw) To: Kyle McMartin; +Cc: John David Anglin, deller, linux-parisc On Fri, Dec 19, 2008 at 11:44 AM, Kyle McMartin <kyle@infradead.org> wrote: > SPACEID_SHIFT is the number of spaceid bits, which is 32 on 64-bit. On 64-bit SPACEID_SHIFT is 11 bits. Verified by hand and by gcc -E/-dM and test program. On 32-bit SPACEID_SHIFT is 0 bits. Verified by hand. > I suspect what this is trying to do, is put it in the rightmost half of > the register, so as long as context is shifted left already, (which it > is.) > > When you cancel out the SPACEID_SHIFT in alloc_sid, and here in > load_context, you end up with the same mtctl(context << 1, 8) as on > 32-bit. Yes, and that keeps the WD bit clear. Cheers, Carlos. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-19 17:28 ` Carlos O'Donell @ 2008-12-19 17:33 ` Kyle McMartin 2008-12-19 17:36 ` Carlos O'Donell 0 siblings, 1 reply; 56+ messages in thread From: Kyle McMartin @ 2008-12-19 17:33 UTC (permalink / raw) To: Carlos O'Donell Cc: Kyle McMartin, John David Anglin, deller, linux-parisc On Fri, Dec 19, 2008 at 12:28:40PM -0500, Carlos O'Donell wrote: > On Fri, Dec 19, 2008 at 11:44 AM, Kyle McMartin <kyle@infradead.org> wrote: > > SPACEID_SHIFT is the number of spaceid bits, which is 32 on 64-bit. > > On 64-bit SPACEID_SHIFT is 11 bits. Verified by hand and by gcc -E/-dM > and test program. > > On 32-bit SPACEID_SHIFT is 0 bits. Verified by hand. > I just assumed MAX_ADDRBITS was 64. Anyway the point is moot. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-19 17:33 ` Kyle McMartin @ 2008-12-19 17:36 ` Carlos O'Donell 2008-12-19 17:39 ` Kyle McMartin 0 siblings, 1 reply; 56+ messages in thread From: Carlos O'Donell @ 2008-12-19 17:36 UTC (permalink / raw) To: Kyle McMartin; +Cc: John David Anglin, deller, linux-parisc On Fri, Dec 19, 2008 at 12:33 PM, Kyle McMartin <kyle@infradead.org> wrote: > On Fri, Dec 19, 2008 at 12:28:40PM -0500, Carlos O'Donell wrote: >> On Fri, Dec 19, 2008 at 11:44 AM, Kyle McMartin <kyle@infradead.org> wrote: >> > SPACEID_SHIFT is the number of spaceid bits, which is 32 on 64-bit. >> >> On 64-bit SPACEID_SHIFT is 11 bits. Verified by hand and by gcc -E/-dM >> and test program. >> >> On 32-bit SPACEID_SHIFT is 0 bits. Verified by hand. >> > > I just assumed MAX_ADDRBITS was 64. Anyway the point is moot. > I was just crossing the T's and dotting the I's. In case we need to remember what SPACEID_SHIFT value was if we see something sketchy. On 64-bit if you are not using 4kb pages, then SPACEID_SHIFT is 0 bits like a 32-bit kernel. c. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-19 17:36 ` Carlos O'Donell @ 2008-12-19 17:39 ` Kyle McMartin 2008-12-19 17:42 ` Kyle McMartin 0 siblings, 1 reply; 56+ messages in thread From: Kyle McMartin @ 2008-12-19 17:39 UTC (permalink / raw) To: Carlos O'Donell Cc: Kyle McMartin, John David Anglin, deller, linux-parisc On Fri, Dec 19, 2008 at 12:36:15PM -0500, Carlos O'Donell wrote: > On Fri, Dec 19, 2008 at 12:33 PM, Kyle McMartin <kyle@infradead.org> wrote: > > On Fri, Dec 19, 2008 at 12:28:40PM -0500, Carlos O'Donell wrote: > >> On Fri, Dec 19, 2008 at 11:44 AM, Kyle McMartin <kyle@infradead.org> wrote: > >> > SPACEID_SHIFT is the number of spaceid bits, which is 32 on 64-bit. > >> > >> On 64-bit SPACEID_SHIFT is 11 bits. Verified by hand and by gcc -E/-dM > >> and test program. > >> > >> On 32-bit SPACEID_SHIFT is 0 bits. Verified by hand. > >> > > > > I just assumed MAX_ADDRBITS was 64. Anyway the point is moot. > > > > I was just crossing the T's and dotting the I's. In case we need to > remember what SPACEID_SHIFT value was if we see something sketchy. > > On 64-bit if you are not using 4kb pages, then SPACEID_SHIFT is 0 bits > like a 32-bit kernel. > That's a bug, and explains one of the reasons why !4K pages are broken. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-19 17:39 ` Kyle McMartin @ 2008-12-19 17:42 ` Kyle McMartin 2008-12-19 18:43 ` Carlos O'Donell 0 siblings, 1 reply; 56+ messages in thread From: Kyle McMartin @ 2008-12-19 17:42 UTC (permalink / raw) To: Kyle McMartin Cc: Carlos O'Donell, John David Anglin, deller, linux-parisc On Fri, Dec 19, 2008 at 12:39:31PM -0500, Kyle McMartin wrote: > On Fri, Dec 19, 2008 at 12:36:15PM -0500, Carlos O'Donell wrote: > > On Fri, Dec 19, 2008 at 12:33 PM, Kyle McMartin <kyle@infradead.org> wrote: > > > On Fri, Dec 19, 2008 at 12:28:40PM -0500, Carlos O'Donell wrote: > > >> On Fri, Dec 19, 2008 at 11:44 AM, Kyle McMartin <kyle@infradead.org> wrote: > > >> > SPACEID_SHIFT is the number of spaceid bits, which is 32 on 64-bit. > > >> > > >> On 64-bit SPACEID_SHIFT is 11 bits. Verified by hand and by gcc -E/-dM > > >> and test program. > > >> > > >> On 32-bit SPACEID_SHIFT is 0 bits. Verified by hand. > > >> > > > > > > I just assumed MAX_ADDRBITS was 64. Anyway the point is moot. > > > > > > > I was just crossing the T's and dotting the I's. In case we need to > > remember what SPACEID_SHIFT value was if we see something sketchy. > > > > On 64-bit if you are not using 4kb pages, then SPACEID_SHIFT is 0 bits > > like a 32-bit kernel. > > > > That's a bug, and explains one of the reasons why !4K pages are broken. > Or rather not, my thinking being that context << 1 might end up being 33-bits, but I forgot we clip spaceids to the width of the protection registers, so this can't happen. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-19 17:42 ` Kyle McMartin @ 2008-12-19 18:43 ` Carlos O'Donell 0 siblings, 0 replies; 56+ messages in thread From: Carlos O'Donell @ 2008-12-19 18:43 UTC (permalink / raw) To: Kyle McMartin; +Cc: John David Anglin, deller, linux-parisc On Fri, Dec 19, 2008 at 12:42 PM, Kyle McMartin <kyle@infradead.org> wrote: > Or rather not, my thinking being that context << 1 might end up being > 33-bits, but I forgot we clip spaceids to the width of the protection > registers, so this can't happen. Yes, that is correct, we only use 17-bit space ids on PA 2.0 (18-bits implemented) and 14-bit space ids on PA 1.1 (15-bits implemented on average). I assume we reserve the last bit for the rightmost WD-bit. Therefore context << 1 yields at most 18-bits. c. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when compiling ruby1.9 2008-12-17 22:46 [PATCH] [RFC] fix kernel crash (protection id trap) when compiling ruby1.9 Helge Deller 2008-12-18 0:05 ` [PATCH] [RFC] fix kernel crash (protection id trap) when compiling John David Anglin @ 2008-12-18 1:46 ` Carlos O'Donell 2008-12-18 2:02 ` Carlos O'Donell 2008-12-18 2:36 ` [PATCH] [RFC] fix kernel crash (protection id trap) when compiling ruby1.9 Kyle McMartin 2008-12-18 7:03 ` Kyle McMartin 2008-12-19 21:29 ` Kyle McMartin 3 siblings, 2 replies; 56+ messages in thread From: Carlos O'Donell @ 2008-12-18 1:46 UTC (permalink / raw) To: Helge Deller; +Cc: linux-parisc On Wed, Dec 17, 2008 at 5:46 PM, Helge Deller <deller@gmx.de> wrote: > The attached patch simply adds the lookup for a fixup handler when trap > #27 (protection id trap) happens in kernel space. This was missing in > the code path for trap #27 which is why the kernel then called > die_if_kernel() and crashed. > > Even with this patch ruby1.9 may fail to compile, but at least the > kernel crashes are gone. > > Any feedback welcome. Can we avoid copying code from do_page_fault and just call do_page_fault? Shouldn't a data protection id trap on a userspace address cause do_page_fault to be called which triggers the lookup and transfer to the exception table location? Why aren't we doing this in the data protection id trap case? I remember testing this for the compare-and-swap kernel LWS, in particular calling with an invalid address, where the kernel must load a userspace address, possibly fault, and jump to the exception location to return EFAULT as the error for the compare-and-swap. Cheers, Carlos. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when compiling ruby1.9 2008-12-18 1:46 ` [PATCH] [RFC] fix kernel crash (protection id trap) when compiling ruby1.9 Carlos O'Donell @ 2008-12-18 2:02 ` Carlos O'Donell 2008-12-18 13:13 ` Helge Deller 2008-12-18 2:36 ` [PATCH] [RFC] fix kernel crash (protection id trap) when compiling ruby1.9 Kyle McMartin 1 sibling, 1 reply; 56+ messages in thread From: Carlos O'Donell @ 2008-12-18 2:02 UTC (permalink / raw) To: Helge Deller; +Cc: linux-parisc On Wed, Dec 17, 2008 at 8:46 PM, Carlos O'Donell <carlos@systemhalted.org> wrote: > On Wed, Dec 17, 2008 at 5:46 PM, Helge Deller <deller@gmx.de> wrote: >> The attached patch simply adds the lookup for a fixup handler when trap >> #27 (protection id trap) happens in kernel space. This was missing in >> the code path for trap #27 which is why the kernel then called >> die_if_kernel() and crashed. >> >> Even with this patch ruby1.9 may fail to compile, but at least the >> kernel crashes are gone. >> >> Any feedback welcome. > > Can we avoid copying code from do_page_fault and just call do_page_fault? > > Shouldn't a data protection id trap on a userspace address cause > do_page_fault to be called which triggers the lookup and transfer to > the exception table location? > > Why aren't we doing this in the data protection id trap case? I went all the way back to 2.6.0 and the code is still the same. We must never have generated a protection id trap against user addresses while in the kernel, otherwise we would have noticed this sooner. I think your code is correct and necessary. However! I think you should not duplicate code. Please split out this "bit" of code into a function in mm/fault.c and call it from handle_interruption. Cheers, Carlos. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when compiling ruby1.9 2008-12-18 2:02 ` Carlos O'Donell @ 2008-12-18 13:13 ` Helge Deller 2008-12-18 15:28 ` [PATCH] [RFC] fix kernel crash (protection id trap) when compiling John David Anglin 0 siblings, 1 reply; 56+ messages in thread From: Helge Deller @ 2008-12-18 13:13 UTC (permalink / raw) To: Carlos O'Donell; +Cc: linux-parisc > On Wed, Dec 17, 2008 at 8:46 PM, Carlos O'Donell > > Can we avoid copying code from do_page_fault and just call > do_page_fault? =2E.. > However! I think you should not duplicate code. Please split out this > "bit" of code into a function in mm/fault.c and call it from > handle_interruption. Sure. This patch was just meant to be short and easy understandable for= review of the problem itself and without other noise like splitting ou= t a function... Helge --=20 Sensationsangebot verl=E4ngert: GMX FreeDSL - Telefonanschluss + DSL=20 f=FCr nur 16,37 Euro/mtl.!* http://dsl.gmx.de/?ac=3DOM.AD.PD003K1308T45= 69a -- To unsubscribe from this list: send the line "unsubscribe linux-parisc"= in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when compiling 2008-12-18 13:13 ` Helge Deller @ 2008-12-18 15:28 ` John David Anglin 2008-12-18 16:09 ` Carlos O'Donell 0 siblings, 1 reply; 56+ messages in thread From: John David Anglin @ 2008-12-18 15:28 UTC (permalink / raw) To: Helge Deller; +Cc: linux-parisc, carlos > > On Wed, Dec 17, 2008 at 8:46 PM, Carlos O'Donell > > > Can we avoid copying code from do_page_fault and just call > > do_page_fault? > ... > > However! I think you should not duplicate code. Please split out this > > "bit" of code into a function in mm/fault.c and call it from > > handle_interruption. > > Sure. This patch was just meant to be short and easy understandable for review of the problem itself and without other noise like splitting out a function... >From the documentation, it appears to me that a protection id trap should be treated differently from a data TLB miss/data page fault. The latter can potentially be fixed with TLB insertion. I don't think a protection ID trap can be fixed... Dave -- J. David Anglin dave.anglin@nrc-cnrc.gc.ca National Research Council of Canada (613) 990-0752 (FAX: 952-6602) ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when compiling 2008-12-18 15:28 ` [PATCH] [RFC] fix kernel crash (protection id trap) when compiling John David Anglin @ 2008-12-18 16:09 ` Carlos O'Donell 0 siblings, 0 replies; 56+ messages in thread From: Carlos O'Donell @ 2008-12-18 16:09 UTC (permalink / raw) To: John David Anglin; +Cc: Helge Deller, linux-parisc On Thu, Dec 18, 2008 at 10:28 AM, John David Anglin <dave@hiauly1.hia.nrc.ca> wrote: > From the documentation, it appears to me that a protection id trap should > be treated differently from a data TLB miss/data page fault. The latter can > potentially be fixed with TLB insertion. I don't think a protection ID trap > can be fixed... Yes, I think that is also true. What if, after a context switch we have stale TLB entry with an access ID from another process? When the kernel tries to write to the user address the current protection ids don't match the page's access id and we generate a protection id trap. Cheers, Carlos. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when compiling ruby1.9 2008-12-18 1:46 ` [PATCH] [RFC] fix kernel crash (protection id trap) when compiling ruby1.9 Carlos O'Donell 2008-12-18 2:02 ` Carlos O'Donell @ 2008-12-18 2:36 ` Kyle McMartin 2008-12-18 12:21 ` Carlos O'Donell 1 sibling, 1 reply; 56+ messages in thread From: Kyle McMartin @ 2008-12-18 2:36 UTC (permalink / raw) To: Carlos O'Donell; +Cc: Helge Deller, linux-parisc On Wed, Dec 17, 2008 at 08:46:30PM -0500, Carlos O'Donell wrote: > On Wed, Dec 17, 2008 at 5:46 PM, Helge Deller <deller@gmx.de> wrote: > > The attached patch simply adds the lookup for a fixup handler when trap > > #27 (protection id trap) happens in kernel space. This was missing in > > the code path for trap #27 which is why the kernel then called > > die_if_kernel() and crashed. > > > > Even with this patch ruby1.9 may fail to compile, but at least the > > kernel crashes are gone. > > > > Any feedback welcome. > > Can we avoid copying code from do_page_fault and just call do_page_fault? > > Shouldn't a data protection id trap on a userspace address cause > do_page_fault to be called which triggers the lookup and transfer to > the exception table location? > Yes, it could just fall through in this case, but I don't think that's necessarily correct either. > Why aren't we doing this in the data protection id trap case? > > I remember testing this for the compare-and-swap kernel LWS, in > particular calling with an invalid address, where the kernel must load > a userspace address, possibly fault, and jump to the exception > location to return EFAULT as the error for the compare-and-swap. > Indeed, there's something more sinister going on. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when compiling ruby1.9 2008-12-18 2:36 ` [PATCH] [RFC] fix kernel crash (protection id trap) when compiling ruby1.9 Kyle McMartin @ 2008-12-18 12:21 ` Carlos O'Donell 0 siblings, 0 replies; 56+ messages in thread From: Carlos O'Donell @ 2008-12-18 12:21 UTC (permalink / raw) To: Kyle McMartin; +Cc: Helge Deller, linux-parisc On Wed, Dec 17, 2008 at 9:36 PM, Kyle McMartin <kyle@infradead.org> wrote: > On Wed, Dec 17, 2008 at 08:46:30PM -0500, Carlos O'Donell wrote: >> On Wed, Dec 17, 2008 at 5:46 PM, Helge Deller <deller@gmx.de> wrote: >> > The attached patch simply adds the lookup for a fixup handler when trap >> > #27 (protection id trap) happens in kernel space. This was missing in >> > the code path for trap #27 which is why the kernel then called >> > die_if_kernel() and crashed. >> > >> > Even with this patch ruby1.9 may fail to compile, but at least the >> > kernel crashes are gone. >> > >> > Any feedback welcome. >> >> Can we avoid copying code from do_page_fault and just call do_page_fault? >> >> Shouldn't a data protection id trap on a userspace address cause >> do_page_fault to be called which triggers the lookup and transfer to >> the exception table location? >> > > Yes, it could just fall through in this case, but I don't think that's > necessarily correct either. Shouldn't it always be wrong to call die_if_kernel on protection id trap without searching the exception tables? This is exactly what happens in the case of trap 27. >> I remember testing this for the compare-and-swap kernel LWS, in >> particular calling with an invalid address, where the kernel must load >> a userspace address, possibly fault, and jump to the exception >> location to return EFAULT as the error for the compare-and-swap. > > Indeed, there's something more sinister going on. It might be that a different fault is generated? I ran my test-lws program again this morning and it correctly returns EFAULT which means the exception handling should be working, *but* I don't know which trap was generated. The ldw could have generated a data tlb miss fault/data page fault trap, and that is correctly handled. A data address memory protection id trap only happens if: - PSW P-bit is 1. - Prot. ID check fails? (What memory has prot id checks on it?) - One of load/Store/Ldcw/cache purge references the memory. Cheers, Carlos. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when compiling ruby1.9 2008-12-17 22:46 [PATCH] [RFC] fix kernel crash (protection id trap) when compiling ruby1.9 Helge Deller 2008-12-18 0:05 ` [PATCH] [RFC] fix kernel crash (protection id trap) when compiling John David Anglin 2008-12-18 1:46 ` [PATCH] [RFC] fix kernel crash (protection id trap) when compiling ruby1.9 Carlos O'Donell @ 2008-12-18 7:03 ` Kyle McMartin 2008-12-18 13:09 ` Helge Deller 2008-12-19 21:29 ` Kyle McMartin 3 siblings, 1 reply; 56+ messages in thread From: Kyle McMartin @ 2008-12-18 7:03 UTC (permalink / raw) To: Helge Deller; +Cc: linux-parisc On Wed, Dec 17, 2008 at 11:46:05PM +0100, Helge Deller wrote: > The Debian bugzilla has a long thread about kernel crashes when > compiling ruby1.9 on hppa. This kernel bug led even to discussions if > hppa should be dropped for lenny. > See http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=478717 for details. > > It's really easy to reproduce the bug, and it generates this backtrace > (interestingly two backtraces): > When does it die? My ruby1.9 is building through to the tests (where it's hanging without crashing...) r, Kyle ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when compiling ruby1.9 2008-12-18 7:03 ` Kyle McMartin @ 2008-12-18 13:09 ` Helge Deller 2008-12-18 15:05 ` Kyle McMartin 0 siblings, 1 reply; 56+ messages in thread From: Helge Deller @ 2008-12-18 13:09 UTC (permalink / raw) To: Kyle McMartin; +Cc: linux-parisc -------- Original-Nachricht -------- > On Wed, Dec 17, 2008 at 11:46:05PM +0100, Helge Deller wrote: > > The Debian bugzilla has a long thread about kernel crashes when > > compiling ruby1.9 on hppa. This kernel bug led even to discussions = if > > hppa should be dropped for lenny. > > See http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=3D478717 for d= etails. > >=20 > > It's really easy to reproduce the bug, and it generates this backtr= ace > > (interestingly two backtraces): >=20 > When does it die? My ruby1.9 is building through to the tests (where > it's hanging without crashing...) If it does not happen directly for you, just try a few times/cycles, e.= g. make, make clean in the build directory. Even killing the hanging miniruby processes can make the kernel die (wh= en my patch wasn't applied). With the patch it stays hanging. Helge --=20 Psssst! Schon vom neuen GMX MultiMessenger geh=F6rt? Der kann`s mit all= en: http://www.gmx.net/de/go/multimessenger -- To unsubscribe from this list: send the line "unsubscribe linux-parisc"= in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when compiling ruby1.9 2008-12-18 13:09 ` Helge Deller @ 2008-12-18 15:05 ` Kyle McMartin 0 siblings, 0 replies; 56+ messages in thread From: Kyle McMartin @ 2008-12-18 15:05 UTC (permalink / raw) To: Helge Deller; +Cc: linux-parisc On Thu, Dec 18, 2008 at 02:09:40PM +0100, Helge Deller wrote: > -------- Original-Nachricht -------- > > On Wed, Dec 17, 2008 at 11:46:05PM +0100, Helge Deller wrote: > > > The Debian bugzilla has a long thread about kernel crashes when > > > compiling ruby1.9 on hppa. This kernel bug led even to discussions if > > > hppa should be dropped for lenny. > > > See http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=478717 for details. > > > > > > It's really easy to reproduce the bug, and it generates this backtrace > > > (interestingly two backtraces): > > > > When does it die? My ruby1.9 is building through to the tests (where > > it's hanging without crashing...) > > If it does not happen directly for you, just try a few times/cycles, e.g. make, make clean in the build directory. > Even killing the hanging miniruby processes can make the kernel die (when my patch wasn't applied). With the patch it stays hanging. > Aside from the testsuite hanging it seems to be building fine for me no matter how many times I let it go... making ruby1.9 make[1]: Entering directory `/home/kyle/src/ruby1.9-1.9.0.2' cc -fno-strict-aliasing -g -g -O2 -O2 -g -Wall -Wno-parentheses -fPIC -L. -rdynamic -Wl,-export-dynamic main.o -lruby1.9 -lpthread -lrt -ldl -lcrypt -lm -o ruby1.9 make[1]: Leaving directory `/home/kyle/src/ruby1.9-1.9.0.2' Let me debootstrap on another machine and try it there, sigh. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when compiling ruby1.9 2008-12-17 22:46 [PATCH] [RFC] fix kernel crash (protection id trap) when compiling ruby1.9 Helge Deller ` (2 preceding siblings ...) 2008-12-18 7:03 ` Kyle McMartin @ 2008-12-19 21:29 ` Kyle McMartin 2008-12-19 22:59 ` Helge Deller 2008-12-21 15:20 ` John David Anglin 3 siblings, 2 replies; 56+ messages in thread From: Kyle McMartin @ 2008-12-19 21:29 UTC (permalink / raw) To: Helge Deller; +Cc: linux-parisc On Wed, Dec 17, 2008 at 11:46:05PM +0100, Helge Deller wrote: > Honestly, I can't decide whether to apply this. It really should never happen in the kernel, since the kernel can guarantee it won't get the access rights failure (highest privilege level, and can set %sr and %protid to whatever it wants.) It really genuinely is a bug that probably should panic the kernel. The only precedent I can easily see is x86 fixing up a bad iret with a general protection fault, which is more or less analogous to code 27 here. On the other hand, taking the exception on a userspace access really isn't all that critical, and there's fundamentally little reason for the kernel not to SIGSEGV the process, and continue... Argh. (btw, I've instrumented my do_sys_poll with a pile of assertions that %cr8 << 1 == %sr3 == current->mm.context... let's see if where we're getting corrupted is deterministic, though, I would guess that it won't be.) diff --git a/arch/parisc/include/asm/uaccess.h b/arch/parisc/include/asm/uaccess.h index 4878b95..1c6dbb6 100644 --- a/arch/parisc/include/asm/uaccess.h +++ b/arch/parisc/include/asm/uaccess.h @@ -241,4 +241,6 @@ unsigned long copy_in_user(void __user *dst, const void __user *src, unsigned lo #define __copy_to_user_inatomic __copy_to_user #define __copy_from_user_inatomic __copy_from_user +int fixup_exception(struct pt_regs *regs); + #endif /* __PARISC_UACCESS_H */ diff --git a/arch/parisc/kernel/traps.c b/arch/parisc/kernel/traps.c index 4c771cd..548ba0c 100644 --- a/arch/parisc/kernel/traps.c +++ b/arch/parisc/kernel/traps.c @@ -745,6 +745,10 @@ void handle_interruption(int code, struct pt_regs *regs) /* Fall Through */ case 27: /* Data memory protection ID trap */ + if (code == 27 && !user_mode(regs) && + fixup_exception(regs)) + return; + die_if_kernel("Protection id trap", regs, code); si.si_code = SEGV_MAPERR; si.si_signo = SIGSEGV; diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c index b2e3e9a..92c7fa4 100644 --- a/arch/parisc/mm/fault.c +++ b/arch/parisc/mm/fault.c @@ -139,13 +139,41 @@ parisc_acctyp(unsigned long code, unsigned int inst) } #endif +int fixup_exception(struct pt_regs *regs) +{ + const struct exception_table_entry *fix; + + fix = search_exception_tables(regs->iaoq[0]); + if (fix) { + struct exception_data *d; + d = &__get_cpu_var(exception_data); + d->fault_ip = regs->iaoq[0]; + d->fault_space = regs->isr; + d->fault_addr = regs->ior; + + regs->iaoq[0] = ((fix->fixup) & ~3); + /* + * NOTE: In some cases the faulting instruction + * may be in the delay slot of a branch. We + * don't want to take the branch, so we don't + * increment iaoq[1], instead we set it to be + * iaoq[0]+4, and clear the B bit in the PSW + */ + regs->iaoq[1] = regs->iaoq[0] + 4; + regs->gr[0] &= ~PSW_B; /* IPSW in gr[0] */ + + return 1; + } + + return 0; +} + void do_page_fault(struct pt_regs *regs, unsigned long code, unsigned long address) { struct vm_area_struct *vma, *prev_vma; struct task_struct *tsk = current; struct mm_struct *mm = tsk->mm; - const struct exception_table_entry *fix; unsigned long acc_type; int fault; @@ -229,32 +257,8 @@ bad_area: no_context: - if (!user_mode(regs)) { - fix = search_exception_tables(regs->iaoq[0]); - - if (fix) { - struct exception_data *d; - - d = &__get_cpu_var(exception_data); - d->fault_ip = regs->iaoq[0]; - d->fault_space = regs->isr; - d->fault_addr = regs->ior; - - regs->iaoq[0] = ((fix->fixup) & ~3); - - /* - * NOTE: In some cases the faulting instruction - * may be in the delay slot of a branch. We - * don't want to take the branch, so we don't - * increment iaoq[1], instead we set it to be - * iaoq[0]+4, and clear the B bit in the PSW - */ - - regs->iaoq[1] = regs->iaoq[0] + 4; - regs->gr[0] &= ~PSW_B; /* IPSW in gr[0] */ ^ permalink raw reply related [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when compiling ruby1.9 2008-12-19 21:29 ` Kyle McMartin @ 2008-12-19 22:59 ` Helge Deller 2008-12-19 23:34 ` Kyle McMartin 2008-12-21 15:20 ` John David Anglin 1 sibling, 1 reply; 56+ messages in thread From: Helge Deller @ 2008-12-19 22:59 UTC (permalink / raw) To: Kyle McMartin; +Cc: linux-parisc Kyle McMartin wrote: > Honestly, I can't decide whether to apply this. It really should never > happen in the kernel, since the kernel can guarantee it won't get the > access rights failure (highest privilege level, and can set %sr and > %protid to whatever it wants.) Yes, that's the problem. We catch a bug in the kernel which shouldn't be there (and which we can't find yet). If it wouldn't be there, we wouldn't need this patch.. Chicken and egg problem. > It really genuinely is a bug that probably should panic the kernel. The > only precedent I can easily see is x86 fixing up a bad iret with a > general protection fault, which is more or less analogous to code 27 > here. > > On the other hand, taking the exception on a userspace access really > isn't all that critical, and there's fundamentally little reason for the > kernel not to SIGSEGV the process, and continue... > > Argh. Argh. Argh :-) > (btw, I've instrumented my do_sys_poll with a pile of assertions that > %cr8 << 1 == %sr3 == current->mm.context... let's see if where we're > getting corrupted is deterministic, though, I would guess that it won't > be.) I'll continue to debug as well. I have some more ideas left to try... Thanks for cleaning up my patch and splitting out the fixup_exception() code. Your patch is fine and I think you should apply it (with one minor addition - see below). The reason I think why you should apply it is, that we shouldn't let the kernel crash on (correct) userspace apps. Maybe adding a comment that this check in trap #27 should be removed when the bug is fixed would be good though. > diff --git a/arch/parisc/include/asm/uaccess.h b/arch/parisc/include/asm/uaccess.h > index 4878b95..1c6dbb6 100644 > --- a/arch/parisc/include/asm/uaccess.h > +++ b/arch/parisc/include/asm/uaccess.h > @@ -745,6 +745,10 @@ void handle_interruption(int code, struct pt_regs *regs) > /* Fall Through */ > case 27: > /* Data memory protection ID trap */ > + if (code == 27 && !user_mode(regs) && > + fixup_exception(regs)) > + return; I think you should add a self-explaining printk(KERN_CRIT,..) here, so that we can track when this bug appeared in userspace. Something like: printk("strange data protection ID trap happened to <procname> PID: xxx. (this is a bug in the kernel)\n"); (you probably find a better wording) Having a correctly-coded userspace process which suddenly just return failures in syscalls is not good and would make people wonder and unnecessarily try to debug their (correct) userspace apps... Helge ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when compiling ruby1.9 2008-12-19 22:59 ` Helge Deller @ 2008-12-19 23:34 ` Kyle McMartin 2008-12-20 17:07 ` [PATCH] [RFC] fix kernel crash (protection id trap) when John David Anglin 0 siblings, 1 reply; 56+ messages in thread From: Kyle McMartin @ 2008-12-19 23:34 UTC (permalink / raw) To: Helge Deller; +Cc: Kyle McMartin, linux-parisc On Fri, Dec 19, 2008 at 11:59:11PM +0100, Helge Deller wrote: > Thanks for cleaning up my patch and splitting out the fixup_exception() > code. > Your patch is fine and I think you should apply it (with one minor > addition - see below). > The reason I think why you should apply it is, that we shouldn't let the > kernel crash on (correct) userspace apps. > Maybe adding a comment that this check in trap #27 should be removed > when the bug is fixed would be good though. > Hmm. You raise a good point that userspace is DTRT... Can you test to see if this patch fixes things? My c3000 with an unpatched kernel has suddenly decided it doesn't want to crash anymore... diff --git a/arch/parisc/kernel/traps.c b/arch/parisc/kernel/traps.c index 4c771cd..73058e8 100644 --- a/arch/parisc/kernel/traps.c +++ b/arch/parisc/kernel/traps.c @@ -745,6 +745,19 @@ void handle_interruption(int code, struct pt_regs *regs) /* Fall Through */ case 27: /* Data memory protection ID trap */ +#define HANDLE_KERNEL_PROTECTION_TRAPS +#ifdef HANDLE_KERNEL_PROTECTION_TRAPS + /* XXX very very evil */ + if (code == 27 && !user_mode(regs)) { + /* bang the spaceid back into sr3 */ + regs->sr[3] = (unsigned long)current->mm.context; + /* TODO? reprogram %cr8? load_context will get it. */ + /* TODO? thread_struct.faulted count to catch + * taking the same fault on the same insn again? + */ + return; + } +#endif /*HANDLE_KERNEL_PROTECTION_TRAPS*/ die_if_kernel("Protection id trap", regs, code); si.si_code = SEGV_MAPERR; si.si_signo = SIGSEGV; ^ permalink raw reply related [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-19 23:34 ` Kyle McMartin @ 2008-12-20 17:07 ` John David Anglin 0 siblings, 0 replies; 56+ messages in thread From: John David Anglin @ 2008-12-20 17:07 UTC (permalink / raw) To: Kyle McMartin; +Cc: deller, kyle, linux-parisc > case 27: > /* Data memory protection ID trap */ > +#define HANDLE_KERNEL_PROTECTION_TRAPS > +#ifdef HANDLE_KERNEL_PROTECTION_TRAPS > + /* XXX very very evil */ > + if (code == 27 && !user_mode(regs)) { Should a check be added for regs->sr[3] != (unsigned long)current->mm.context? > + /* bang the spaceid back into sr3 */ > + regs->sr[3] = (unsigned long)current->mm.context; I seem to recall that James thought there was a problem with sr3 handling a year or two ago. > + /* TODO? reprogram %cr8? load_context will get it. */ > + /* TODO? thread_struct.faulted count to catch > + * taking the same fault on the same insn again? Is this a general problem with parisc exception support? The reason I ask is it seems possible for an infinite loop to occur when an application catches exceptions like bash. It catches terminating signals so that it can write the history file. When terminate_immediately is zero, the signal handler (termsig_sighandler) just records the terminating signal and returns. It doesn't reset the default handler for the signal or specify SA_RESETHAND as far as I can tell. It then appears the kernel restarts the application at the point of the exception generating another exception. This causes the kernel log files to fill in short order. I would say bash's behavior in trying to continue after a SEGV is questionable but I also think the kernel shouldn't retry a faulting instruction unless there's a chance the fault has been fixed. Dave -- J. David Anglin dave.anglin@nrc-cnrc.gc.ca National Research Council of Canada (613) 990-0752 (FAX: 952-6602) ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-19 21:29 ` Kyle McMartin 2008-12-19 22:59 ` Helge Deller @ 2008-12-21 15:20 ` John David Anglin 2008-12-21 17:27 ` John David Anglin ` (2 more replies) 1 sibling, 3 replies; 56+ messages in thread From: John David Anglin @ 2008-12-21 15:20 UTC (permalink / raw) To: Kyle McMartin; +Cc: deller, linux-parisc > Honestly, I can't decide whether to apply this. It really should never > happen in the kernel, since the kernel can guarantee it won't get the > access rights failure (highest privilege level, and can set %sr and > %protid to whatever it wants.) I believe this change should be applied because we enable P and the hardware can generate the exception. Yes, it should never happen, but the coding to prevent this occurring is very subtle. > It really genuinely is a bug that probably should panic the kernel. The > only precedent I can easily see is x86 fixing up a bad iret with a > general protection fault, which is more or less analogous to code 27 > here. I don't think we have enough information here. A panic doesn't tell us what we need to know... > On the other hand, taking the exception on a userspace access really > isn't all that critical, and there's fundamentally little reason for the > kernel not to SIGSEGV the process, and continue... Agreed, especially in a __put_user operation. > (btw, I've instrumented my do_sys_poll with a pile of assertions that > %cr8 << 1 == %sr3 == current->mm.context... let's see if where we're > getting corrupted is deterministic, though, I would guess that it won't > be.) Having looked at the code a bit, I have come to the conclusion that the problem is not with the sr3 value. We have a one-to-one mapping between space register values and protection IDs. TLB entries get inserted using the space register value for the protection ID (see entry.S). So, I think we have to have inconsistent values in cr8 and sr3. I think the bug may be in flush_user_cache_page_non_current. It hijacks sr3 temporarily and I don't think cr8 is updated when this is done. The switch may need to be atomic. Dave -- J. David Anglin dave.anglin@nrc-cnrc.gc.ca National Research Council of Canada (613) 990-0752 (FAX: 952-6602) ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-21 15:20 ` John David Anglin @ 2008-12-21 17:27 ` John David Anglin 2008-12-21 21:33 ` Kyle McMartin 2008-12-21 22:52 ` Helge Deller 2008-12-21 22:20 ` Kyle McMartin 2008-12-21 22:59 ` Helge Deller 2 siblings, 2 replies; 56+ messages in thread From: John David Anglin @ 2008-12-21 17:27 UTC (permalink / raw) To: John David Anglin; +Cc: kyle, deller, linux-parisc > I think the bug may be in flush_user_cache_page_non_current. It hijacks > sr3 temporarily and I don't think cr8 is updated when this is done. The > switch may need to be atomic. The following might fix the protection ID bug. --- cache.c.orig 2008-07-17 21:24:46.000000000 -0400 +++ cache.c 2008-12-21 11:53:54.000000000 -0500 @@ -312,14 +312,14 @@ /* make us current */ mtctl(__pa(vma->vm_mm->pgd), 25); - mtsp(vma->vm_mm->context, 3); + load_context(vma->vm_mm->context); flush_user_dcache_page(vmaddr); if(vma->vm_flags & VM_EXEC) flush_user_icache_page(vmaddr); /* put the old current process back */ - mtsp(space, 3); + load_context(space); mtctl(pgd, 25); preempt_enable(); } Dave -- J. David Anglin dave.anglin@nrc-cnrc.gc.ca National Research Council of Canada (613) 990-0752 (FAX: 952-6602) ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-21 17:27 ` John David Anglin @ 2008-12-21 21:33 ` Kyle McMartin 2008-12-21 22:02 ` Kyle McMartin 2008-12-21 22:11 ` John David Anglin 2008-12-21 22:52 ` Helge Deller 1 sibling, 2 replies; 56+ messages in thread From: Kyle McMartin @ 2008-12-21 21:33 UTC (permalink / raw) To: John David Anglin; +Cc: kyle, deller, linux-parisc On Sun, Dec 21, 2008 at 12:27:42PM -0500, John David Anglin wrote: > > I think the bug may be in flush_user_cache_page_non_current. It hijacks > > sr3 temporarily and I don't think cr8 is updated when this is done. The > > switch may need to be atomic. > > The following might fix the protection ID bug. > I'm not convinced... this doesn't explain why we don't see this on 64-bit... ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-21 21:33 ` Kyle McMartin @ 2008-12-21 22:02 ` Kyle McMartin 2008-12-21 22:52 ` John David Anglin 2008-12-21 22:11 ` John David Anglin 1 sibling, 1 reply; 56+ messages in thread From: Kyle McMartin @ 2008-12-21 22:02 UTC (permalink / raw) To: Kyle McMartin; +Cc: John David Anglin, deller, linux-parisc On Sun, Dec 21, 2008 at 04:33:03PM -0500, Kyle McMartin wrote: > On Sun, Dec 21, 2008 at 12:27:42PM -0500, John David Anglin wrote: > > > I think the bug may be in flush_user_cache_page_non_current. It hijacks > > > sr3 temporarily and I don't think cr8 is updated when this is done. The > > > switch may need to be atomic. > > > > The following might fix the protection ID bug. > > > > I'm not convinced... this doesn't explain why we don't see this on > 64-bit... > Nope, no love... :\ ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-21 22:02 ` Kyle McMartin @ 2008-12-21 22:52 ` John David Anglin 0 siblings, 0 replies; 56+ messages in thread From: John David Anglin @ 2008-12-21 22:52 UTC (permalink / raw) To: Kyle McMartin; +Cc: kyle, deller, linux-parisc > Nope, no love... :\ Sigh, I knew it. Did any of your instrumentation trigger? Dave -- J. David Anglin dave.anglin@nrc-cnrc.gc.ca National Research Council of Canada (613) 990-0752 (FAX: 952-6602) ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-21 21:33 ` Kyle McMartin 2008-12-21 22:02 ` Kyle McMartin @ 2008-12-21 22:11 ` John David Anglin 1 sibling, 0 replies; 56+ messages in thread From: John David Anglin @ 2008-12-21 22:11 UTC (permalink / raw) To: Kyle McMartin; +Cc: kyle, deller, linux-parisc > On Sun, Dec 21, 2008 at 12:27:42PM -0500, John David Anglin wrote: > > > I think the bug may be in flush_user_cache_page_non_current. It hijacks > > > sr3 temporarily and I don't think cr8 is updated when this is done. The > > > switch may need to be atomic. > > > > The following might fix the protection ID bug. > > > > I'm not convinced... this doesn't explain why we don't see this on > 64-bit... Agreed. However, we do have tlb issues on 64-bit and this is just about the only place where sr3 is messed with and cr8 isn't correctly set. I'm sure the bug is somewhat timing dependent. There has to be something else that causes a userspace access with inconsistent space and protection IDs. The cache flush can cause non access tlb misses, and there is some difference in tlb hardware, but cr8 isn't involved in the tlb miss handling as far as I can see. I'm running with the patch on 2.6.28-rc8 (SMP) and 2.6.22.19 (UP). Dave -- J. David Anglin dave.anglin@nrc-cnrc.gc.ca National Research Council of Canada (613) 990-0752 (FAX: 952-6602) ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-21 17:27 ` John David Anglin 2008-12-21 21:33 ` Kyle McMartin @ 2008-12-21 22:52 ` Helge Deller 2008-12-21 22:58 ` John David Anglin 2008-12-22 22:43 ` James Bottomley 1 sibling, 2 replies; 56+ messages in thread From: Helge Deller @ 2008-12-21 22:52 UTC (permalink / raw) To: John David Anglin; +Cc: kyle, linux-parisc John David Anglin wrote: >> I think the bug may be in flush_user_cache_page_non_current. It hijacks >> sr3 temporarily and I don't think cr8 is updated when this is done. The >> switch may need to be atomic. > > The following might fix the protection ID bug. > > --- cache.c.orig 2008-07-17 21:24:46.000000000 -0400 > +++ cache.c 2008-12-21 11:53:54.000000000 -0500 > @@ -312,14 +312,14 @@ > > /* make us current */ > mtctl(__pa(vma->vm_mm->pgd), 25); > - mtsp(vma->vm_mm->context, 3); > + load_context(vma->vm_mm->context); > > flush_user_dcache_page(vmaddr); > if(vma->vm_flags & VM_EXEC) > flush_user_icache_page(vmaddr); > > /* put the old current process back */ > - mtsp(space, 3); > + load_context(space); I came to the similar conclusion and tried exactly this patch earlier today. It didn't fixed the problem (although I had the feeling that the bug didn't appeared as often then). Helge ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-21 22:52 ` Helge Deller @ 2008-12-21 22:58 ` John David Anglin 2008-12-21 23:08 ` Helge Deller 2008-12-22 22:43 ` James Bottomley 1 sibling, 1 reply; 56+ messages in thread From: John David Anglin @ 2008-12-21 22:58 UTC (permalink / raw) To: Helge Deller; +Cc: kyle, linux-parisc > > - mtsp(space, 3); > > + load_context(space); > > I came to the similar conclusion and tried exactly this patch earlier > today. It didn't fixed the problem (although I had the feeling that the > bug didn't appeared as often then). Ok, then maybe load_context needs to be atomic. This is a bit tricky because we may have to ensure that no tlb misses are triggered (relied upon translation) during the update. Dave -- J. David Anglin dave.anglin@nrc-cnrc.gc.ca National Research Council of Canada (613) 990-0752 (FAX: 952-6602) ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-21 22:58 ` John David Anglin @ 2008-12-21 23:08 ` Helge Deller 2008-12-22 0:07 ` John David Anglin 0 siblings, 1 reply; 56+ messages in thread From: Helge Deller @ 2008-12-21 23:08 UTC (permalink / raw) To: John David Anglin; +Cc: kyle, linux-parisc John David Anglin wrote: >>> - mtsp(space, 3); >>> + load_context(space); >> I came to the similar conclusion and tried exactly this patch earlier >> today. It didn't fixed the problem (although I had the feeling that the >> bug didn't appeared as often then). > > Ok, then maybe load_context needs to be atomic. This is a bit tricky > because we may have to ensure that no tlb misses are triggered (relied > upon translation) during the update. I'll try tomorrow. What makes me wondering: a) the bug always triggers AFAICS with applications which uses threads (for the ruby1.9 problem it's always the miniruby process). Maybe the problem happens to something being wrong in the signal handler with threadened applications, e.g. arch/parisc/kernel/signal.c:648 ? b) maybe stupid question: In the case it's a generic processor problem, would e.g. changing the kernel to use sr4 instead of sr3 for userspace-accesses change something? What does HPUX uses? At least one could try...? Helge ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-21 23:08 ` Helge Deller @ 2008-12-22 0:07 ` John David Anglin 0 siblings, 0 replies; 56+ messages in thread From: John David Anglin @ 2008-12-22 0:07 UTC (permalink / raw) To: Helge Deller; +Cc: kyle, linux-parisc > John David Anglin wrote: > >>> - mtsp(space, 3); > >>> + load_context(space); > >> I came to the similar conclusion and tried exactly this patch earlier > >> today. It didn't fixed the problem (although I had the feeling that the > >> bug didn't appeared as often then). > > > > Ok, then maybe load_context needs to be atomic. This is a bit tricky > > because we may have to ensure that no tlb misses are triggered (relied > > upon translation) during the update. > > I'll try tomorrow. Another thing that I'm wondering about is the following. The tlb miss handlers assume the following: * cr24 contains a pointer to the kernel address space * page directory. * * cr25 contains a pointer to the current user address * space page directory. * * sr3 will contain the space id of the user address space * of the current running thread while that thread is * running in the kernel. Possibly, load_context needs to update cr25 as well. Assume cr24 never changes. > What makes me wondering: > > a) the bug always triggers AFAICS with applications which uses threads > (for the ruby1.9 problem it's always the miniruby process). Maybe the > problem happens to something being wrong in the signal handler with > threadened applications, e.g. arch/parisc/kernel/signal.c:648 ? In my gcc builds, it's bash and make that experience the majority of unexplained segvs. There is an issue with the signal handler and bash that causes a loop, however I think the initial fault was caused by a tlb issue. > b) maybe stupid question: In the case it's a generic processor problem, > would e.g. changing the kernel to use sr4 instead of sr3 for > userspace-accesses change something? What does HPUX uses? At least one > could try...? Personally, it's not clear to me that this is just a problem with kernel userspace accesses. If sr3 is corrupt in the kernel, sr7 will be corrupt in userspace. Think the only thing special about sr3 is that the kernel changes it for cache flushes, forks, etc. Your comment that it's sr3 that's wrong suggests a problem with context switches, particularly since the corrupt value is close to the correct value. If sr3 and cr8 are still inconsistent after the patch to flush_user_cache_page_non_current, we must be missing a mechanism that updates sr3. One thought is to load cr8 before sr3 in load_context and see what happens. Dave -- J. David Anglin dave.anglin@nrc-cnrc.gc.ca National Research Council of Canada (613) 990-0752 (FAX: 952-6602) ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-21 22:52 ` Helge Deller 2008-12-21 22:58 ` John David Anglin @ 2008-12-22 22:43 ` James Bottomley 2008-12-22 22:46 ` Kyle McMartin 2008-12-23 2:31 ` John David Anglin 1 sibling, 2 replies; 56+ messages in thread From: James Bottomley @ 2008-12-22 22:43 UTC (permalink / raw) To: Helge Deller; +Cc: John David Anglin, kyle, linux-parisc On Sun, 2008-12-21 at 23:52 +0100, Helge Deller wrote: > John David Anglin wrote: > >> I think the bug may be in flush_user_cache_page_non_current. It hijacks > >> sr3 temporarily and I don't think cr8 is updated when this is done. The > >> switch may need to be atomic. > > > > The following might fix the protection ID bug. > > > > --- cache.c.orig 2008-07-17 21:24:46.000000000 -0400 > > +++ cache.c 2008-12-21 11:53:54.000000000 -0500 > > @@ -312,14 +312,14 @@ > > > > /* make us current */ > > mtctl(__pa(vma->vm_mm->pgd), 25); > > - mtsp(vma->vm_mm->context, 3); > > + load_context(vma->vm_mm->context); > > > > flush_user_dcache_page(vmaddr); > > if(vma->vm_flags & VM_EXEC) > > flush_user_icache_page(vmaddr); > > > > /* put the old current process back */ > > - mtsp(space, 3); > > + load_context(space); > > I came to the similar conclusion and tried exactly this patch earlier > today. It didn't fixed the problem (although I had the feeling that the > bug didn't appeared as often then). Actually, it should bug more often. This function: flush_user_cache_page_non_current() is very rarely called (which is hopefully why you don't see an increase in bugs). However, this is a kernel function ... if you call load_context() here, you'll get the user protection IDs in the register and it will immediately fault when it returns to the kernel. All it should be doing (which is what it currently does) is to set up sr3 to allow the kernel to poke into a user address space, which is the design of the function. James ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-22 22:43 ` James Bottomley @ 2008-12-22 22:46 ` Kyle McMartin 2008-12-23 2:31 ` John David Anglin 1 sibling, 0 replies; 56+ messages in thread From: Kyle McMartin @ 2008-12-22 22:46 UTC (permalink / raw) To: James Bottomley; +Cc: Helge Deller, John David Anglin, kyle, linux-parisc On Mon, Dec 22, 2008 at 04:43:36PM -0600, James Bottomley wrote: > Actually, it should bug more often. This function: > flush_user_cache_page_non_current() is very rarely called (which is > hopefully why you don't see an increase in bugs). However, this is a > kernel function ... if you call load_context() here, you'll get the user > protection IDs in the register and it will immediately fault when it > returns to the kernel. All it should be doing (which is what it > currently does) is to set up sr3 to allow the kernel to poke into a user > address space, which is the design of the function. > Well, it will work by accident, since the other 3 (or 7) prot id registers will still be zeroed, since we only ever touch %cr8... ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-22 22:43 ` James Bottomley 2008-12-22 22:46 ` Kyle McMartin @ 2008-12-23 2:31 ` John David Anglin 2008-12-23 2:54 ` Kyle McMartin 1 sibling, 1 reply; 56+ messages in thread From: John David Anglin @ 2008-12-23 2:31 UTC (permalink / raw) To: James Bottomley; +Cc: deller, kyle, linux-parisc > Actually, it should bug more often. This function: > flush_user_cache_page_non_current() is very rarely called (which is > hopefully why you don't see an increase in bugs). However, this is a > kernel function ... if you call load_context() here, you'll get the user > protection IDs in the register and it will immediately fault when it > returns to the kernel. All it should be doing (which is what it > currently does) is to set up sr3 to allow the kernel to poke into a user > address space, which is the design of the function. As things stand, the call is fully inlined. Nothing can be poked into the user address space without the protection ID register being correctly set. The cache flushes can trigger a non-access tlb miss fault. At the moment, I can't see how this function could cause a protection ID fault. However, it still seems good practice to set cr8 consistently when the user context is changed. The kernel uses space register values of zero, so it shouldn't fault because of the change to sr3 and cr8 for non current. The values are restored before flush_user_cache_page_non_current() exits. Dave -- J. David Anglin dave.anglin@nrc-cnrc.gc.ca National Research Council of Canada (613) 990-0752 (FAX: 952-6602) ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-23 2:31 ` John David Anglin @ 2008-12-23 2:54 ` Kyle McMartin 2008-12-23 3:15 ` John David Anglin 2008-12-23 13:13 ` John David Anglin 0 siblings, 2 replies; 56+ messages in thread From: Kyle McMartin @ 2008-12-23 2:54 UTC (permalink / raw) To: John David Anglin; +Cc: James Bottomley, deller, kyle, linux-parisc On Mon, Dec 22, 2008 at 09:31:10PM -0500, John David Anglin wrote: > > Actually, it should bug more often. This function: > > flush_user_cache_page_non_current() is very rarely called (which is > > hopefully why you don't see an increase in bugs). However, this is a > > kernel function ... if you call load_context() here, you'll get the user > > protection IDs in the register and it will immediately fault when it > > returns to the kernel. All it should be doing (which is what it > > currently does) is to set up sr3 to allow the kernel to poke into a user > > address space, which is the design of the function. > > As things stand, the call is fully inlined. Nothing can be poked into > the user address space without the protection ID register being correctly > set. The cache flushes can trigger a non-access tlb miss fault. At > the moment, I can't see how this function could cause a protection ID fault. > However, it still seems good practice to set cr8 consistently when the > user context is changed. > > The kernel uses space register values of zero, so it shouldn't fault > because of the change to sr3 and cr8 for non current. The values are > restored before flush_user_cache_page_non_current() exits. > just fyi, jejb sleuthed out that it was flush_tlb_mm's non-SMP path (which "flushes" the tlb by allocating a new spaceid) that was causing the problem, and helge verified that it fixes the bug. i'm confirming now with a make && make clean loop building ruby, if it survives the night, i'll commit the patch tomorrow and push it to stable until we have a clean(er) fix. diff --git a/arch/parisc/include/asm/tlbflush.h b/arch/parisc/include/asm/tlbflush.h index b72ec66..1f6fd4f 100644 --- a/arch/parisc/include/asm/tlbflush.h +++ b/arch/parisc/include/asm/tlbflush.h @@ -44,9 +44,12 @@ static inline void flush_tlb_mm(struct mm_struct *mm) { BUG_ON(mm == &init_mm); /* Should never happen */ -#ifdef CONFIG_SMP +#if 1 || defined(CONFIG_SMP) flush_tlb_all(); #else + /* FIXME: currently broken, causing space id and protection ids + * to go out of sync, resulting in faults on userspace accesses. + */ if (mm) { if (mm->context != 0) free_sid(mm->context); ^ permalink raw reply related [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-23 2:54 ` Kyle McMartin @ 2008-12-23 3:15 ` John David Anglin 2008-12-23 13:13 ` John David Anglin 1 sibling, 0 replies; 56+ messages in thread From: John David Anglin @ 2008-12-23 3:15 UTC (permalink / raw) To: Kyle McMartin; +Cc: James.Bottomley, deller, kyle, linux-parisc > just fyi, jejb sleuthed out that it was flush_tlb_mm's non-SMP path > (which "flushes" the tlb by allocating a new spaceid) that was causing > the problem, and helge verified that it fixes the bug. That's great news! I'd noticed this code tonight and wondered why it wasn't loading cr25. Dave -- J. David Anglin dave.anglin@nrc-cnrc.gc.ca National Research Council of Canada (613) 990-0752 (FAX: 952-6602) ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-23 2:54 ` Kyle McMartin 2008-12-23 3:15 ` John David Anglin @ 2008-12-23 13:13 ` John David Anglin 1 sibling, 0 replies; 56+ messages in thread From: John David Anglin @ 2008-12-23 13:13 UTC (permalink / raw) To: Kyle McMartin; +Cc: James.Bottomley, deller, kyle, linux-parisc > just fyi, jejb sleuthed out that it was flush_tlb_mm's non-SMP path > (which "flushes" the tlb by allocating a new spaceid) that was causing > the problem, and helge verified that it fixes the bug. It seems this code can be called from drivers. Maybe the context update needs to be surrounded by task_lock(current)...task_unlock(current). Dave -- J. David Anglin dave.anglin@nrc-cnrc.gc.ca National Research Council of Canada (613) 990-0752 (FAX: 952-6602) ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-21 15:20 ` John David Anglin 2008-12-21 17:27 ` John David Anglin @ 2008-12-21 22:20 ` Kyle McMartin 2008-12-21 22:48 ` John David Anglin 2008-12-22 22:46 ` James Bottomley 2008-12-21 22:59 ` Helge Deller 2 siblings, 2 replies; 56+ messages in thread From: Kyle McMartin @ 2008-12-21 22:20 UTC (permalink / raw) To: John David Anglin; +Cc: Kyle McMartin, deller, linux-parisc On Sun, Dec 21, 2008 at 10:20:16AM -0500, John David Anglin wrote: > > Honestly, I can't decide whether to apply this. It really should never > > happen in the kernel, since the kernel can guarantee it won't get the > > access rights failure (highest privilege level, and can set %sr and > > %protid to whatever it wants.) > > I believe this change should be applied because we enable P and the > hardware can generate the exception. Yes, it should never happen, but > the coding to prevent this occurring is very subtle. > If we're going to paper over a bug like this, we might as well just turn the P-bit off... ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-21 22:20 ` Kyle McMartin @ 2008-12-21 22:48 ` John David Anglin 2008-12-22 22:46 ` James Bottomley 1 sibling, 0 replies; 56+ messages in thread From: John David Anglin @ 2008-12-21 22:48 UTC (permalink / raw) To: Kyle McMartin; +Cc: kyle, deller, linux-parisc > > On Sun, Dec 21, 2008 at 10:20:16AM -0500, John David Anglin wrote: > > > Honestly, I can't decide whether to apply this. It really should never > > > happen in the kernel, since the kernel can guarantee it won't get the > > > access rights failure (highest privilege level, and can set %sr and > > > %protid to whatever it wants.) > > > > I believe this change should be applied because we enable P and the > > hardware can generate the exception. Yes, it should never happen, but > > the coding to prevent this occurring is very subtle. > > > > If we're going to paper over a bug like this, we might as well just turn > the P-bit off... I just want a segv instead of a panic as I don't think we know whether the kernel is bad or not. This bug has been around for for some time. I believe that it was reported in 2004, or earlier. Even HP has problems. Saw this today: Release Date: Nov 19 2008 Description : PHKL_39110: ( QX:QXCR1000827621 ) The system can panic with the following panic string and stack trace. Panic string : Data memory protection/access rights/alignment fault FUNC panic+0x6c report_trap_or_int_and_panic+0x94 interrupt+0x4e4 ihandler+0x940 +------------- TRAP ---------------------------- | Trap type 18 in KERNEL mode at 0x38f50 (lbcopy+0x1f0) | p struct save_state 0.0x189cc60 +------------- TRAP ---------------- Dave -- J. David Anglin dave.anglin@nrc-cnrc.gc.ca National Research Council of Canada (613) 990-0752 (FAX: 952-6602) ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-21 22:20 ` Kyle McMartin 2008-12-21 22:48 ` John David Anglin @ 2008-12-22 22:46 ` James Bottomley 2008-12-22 22:47 ` Kyle McMartin 1 sibling, 1 reply; 56+ messages in thread From: James Bottomley @ 2008-12-22 22:46 UTC (permalink / raw) To: Kyle McMartin; +Cc: John David Anglin, deller, linux-parisc On Sun, 2008-12-21 at 17:20 -0500, Kyle McMartin wrote: > On Sun, Dec 21, 2008 at 10:20:16AM -0500, John David Anglin wrote: > > > Honestly, I can't decide whether to apply this. It really should never > > > happen in the kernel, since the kernel can guarantee it won't get the > > > access rights failure (highest privilege level, and can set %sr and > > > %protid to whatever it wants.) > > > > I believe this change should be applied because we enable P and the > > hardware can generate the exception. Yes, it should never happen, but > > the coding to prevent this occurring is very subtle. > > > > If we're going to paper over a bug like this, we might as well just turn > the P-bit off... I'm afraid you can't. The invisibility of the kernel space from user space relies on this, as does quite a lot of the page protections. James ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-22 22:46 ` James Bottomley @ 2008-12-22 22:47 ` Kyle McMartin 0 siblings, 0 replies; 56+ messages in thread From: Kyle McMartin @ 2008-12-22 22:47 UTC (permalink / raw) To: James Bottomley; +Cc: Kyle McMartin, John David Anglin, deller, linux-parisc On Mon, Dec 22, 2008 at 04:46:30PM -0600, James Bottomley wrote: > On Sun, 2008-12-21 at 17:20 -0500, Kyle McMartin wrote: > > On Sun, Dec 21, 2008 at 10:20:16AM -0500, John David Anglin wrote: > > > > Honestly, I can't decide whether to apply this. It really should never > > > > happen in the kernel, since the kernel can guarantee it won't get the > > > > access rights failure (highest privilege level, and can set %sr and > > > > %protid to whatever it wants.) > > > > > > I believe this change should be applied because we enable P and the > > > hardware can generate the exception. Yes, it should never happen, but > > > the coding to prevent this occurring is very subtle. > > > > > > > If we're going to paper over a bug like this, we might as well just turn > > the P-bit off... > > I'm afraid you can't. The invisibility of the kernel space from user > space relies on this, as does quite a lot of the page protections. > Access right bits are still enforced, just the access id checking is disabled when the P-bit is off. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] [RFC] fix kernel crash (protection id trap) when 2008-12-21 15:20 ` John David Anglin 2008-12-21 17:27 ` John David Anglin 2008-12-21 22:20 ` Kyle McMartin @ 2008-12-21 22:59 ` Helge Deller 2 siblings, 0 replies; 56+ messages in thread From: Helge Deller @ 2008-12-21 22:59 UTC (permalink / raw) To: John David Anglin; +Cc: Kyle McMartin, linux-parisc John David Anglin wrote: > Having looked at the code a bit, I have come to the conclusion that the > problem is not with the sr3 value. In all debugging I added I could _always_ see that sr3 had the wrong values. cr8 and mm->context were always correct. Interestingly it was often just a few numbers off in the lower 4 bits, e.g. sr3 should have been 0x1a8 but was 0x1a2 (just an example). > We have a one-to-one mapping between > space register values and protection IDs. TLB entries get inserted using > the space register value for the protection ID (see entry.S). So, I think > we have to have inconsistent values in cr8 and sr3. Again, only sr3 is wrong. cr8 is correct. > I think the bug may be in flush_user_cache_page_non_current. It hijacks > sr3 temporarily and I don't think cr8 is updated when this is done. The > switch may need to be atomic. I tried that as well, e.g. adding local_irq_[en|dis]able didn't worked (it crashed IIRC). Helge ^ permalink raw reply [flat|nested] 56+ messages in thread
end of thread, other threads:[~2008-12-23 13:13 UTC | newest] Thread overview: 56+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-12-17 22:46 [PATCH] [RFC] fix kernel crash (protection id trap) when compiling ruby1.9 Helge Deller 2008-12-18 0:05 ` [PATCH] [RFC] fix kernel crash (protection id trap) when compiling John David Anglin 2008-12-18 0:43 ` Kyle McMartin 2008-12-18 3:02 ` [PATCH] [RFC] fix kernel crash (protection id trap) when John David Anglin 2008-12-18 3:05 ` Kyle McMartin 2008-12-18 4:04 ` Kyle McMartin 2008-12-18 16:16 ` Carlos O'Donell 2008-12-18 18:28 ` Kyle McMartin 2008-12-19 15:25 ` Carlos O'Donell 2008-12-19 16:13 ` John David Anglin 2008-12-19 16:23 ` Carlos O'Donell 2008-12-19 16:28 ` Kyle McMartin 2008-12-19 16:35 ` Carlos O'Donell 2008-12-19 16:36 ` Carlos O'Donell 2008-12-19 16:44 ` Kyle McMartin 2008-12-19 17:28 ` Carlos O'Donell 2008-12-19 17:33 ` Kyle McMartin 2008-12-19 17:36 ` Carlos O'Donell 2008-12-19 17:39 ` Kyle McMartin 2008-12-19 17:42 ` Kyle McMartin 2008-12-19 18:43 ` Carlos O'Donell 2008-12-18 1:46 ` [PATCH] [RFC] fix kernel crash (protection id trap) when compiling ruby1.9 Carlos O'Donell 2008-12-18 2:02 ` Carlos O'Donell 2008-12-18 13:13 ` Helge Deller 2008-12-18 15:28 ` [PATCH] [RFC] fix kernel crash (protection id trap) when compiling John David Anglin 2008-12-18 16:09 ` Carlos O'Donell 2008-12-18 2:36 ` [PATCH] [RFC] fix kernel crash (protection id trap) when compiling ruby1.9 Kyle McMartin 2008-12-18 12:21 ` Carlos O'Donell 2008-12-18 7:03 ` Kyle McMartin 2008-12-18 13:09 ` Helge Deller 2008-12-18 15:05 ` Kyle McMartin 2008-12-19 21:29 ` Kyle McMartin 2008-12-19 22:59 ` Helge Deller 2008-12-19 23:34 ` Kyle McMartin 2008-12-20 17:07 ` [PATCH] [RFC] fix kernel crash (protection id trap) when John David Anglin 2008-12-21 15:20 ` John David Anglin 2008-12-21 17:27 ` John David Anglin 2008-12-21 21:33 ` Kyle McMartin 2008-12-21 22:02 ` Kyle McMartin 2008-12-21 22:52 ` John David Anglin 2008-12-21 22:11 ` John David Anglin 2008-12-21 22:52 ` Helge Deller 2008-12-21 22:58 ` John David Anglin 2008-12-21 23:08 ` Helge Deller 2008-12-22 0:07 ` John David Anglin 2008-12-22 22:43 ` James Bottomley 2008-12-22 22:46 ` Kyle McMartin 2008-12-23 2:31 ` John David Anglin 2008-12-23 2:54 ` Kyle McMartin 2008-12-23 3:15 ` John David Anglin 2008-12-23 13:13 ` John David Anglin 2008-12-21 22:20 ` Kyle McMartin 2008-12-21 22:48 ` John David Anglin 2008-12-22 22:46 ` James Bottomley 2008-12-22 22:47 ` Kyle McMartin 2008-12-21 22:59 ` Helge Deller
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox