public inbox for linux-kernel@vger.kernel.org
* Multi-thread corefiles broken since April
@ 2005-12-08  6:52 Steve Work
  2005-12-11  0:44 ` Andrew Morton
  2005-12-31 14:28 ` Adrian Bunk
  0 siblings, 2 replies; 4+ messages in thread
From: Steve Work @ 2005-12-08  6:52 UTC (permalink / raw)
  To: linux-kernel

Coredumps from programs with more than one thread show garbage 
information for all threads except the primary.  The problem was 
introduced with:

http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=5df240826c90afdc7956f55a004ea6b702df9203

on Apr 16 ("fix crash in entry.S restore_all") and is still present in 
current builds.

"kill -SEGV" this program and "info threads" the resulting corefile to 
see the problem:

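#include <pthread.h>
#include <unistd.h>   /* sleep() */

/* Each worker sleeps forever, so it is parked in nanosleep() at dump time. */
static void *thread_sleep(void *x) { (void)x; while (1) sleep(30); return NULL; }

int main(void) {
     enum { tcount = 5 };   /* number of worker threads */
     pthread_t thr[tcount];
     int i;
     for (i = 0; i < tcount; ++i)
         pthread_create(&thr[i], NULL, thread_sleep, NULL);
     while (1)
         sleep(30);
     return 0;
}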

(gdb) info threads
   7 process 18138  0x00000246 in ?? ()
   6 process 18139  0x00000246 in ?? ()
   5 process 18140  0x00000246 in ?? ()
   4 process 18141  0x00000246 in ?? ()
   3 process 18142  0x00000246 in ?? ()
   2 process 18143  0x00000246 in ?? ()
* 1 process 18137  0xb7e69db6 in nanosleep () from /lib/tls/libc.so.6
(gdb)

All these threads should show a legitimate location (the same spot in 
nanosleep), and they do on kernels prior to the commit named above.  (Note 
that one thread too many is listed here as well; is this a related problem?)

Commenting out this line (in arch/i386/kernel/process.c:copy_thread) 
fixes the corefiles:

   childregs = (struct pt_regs *) ((unsigned long) childregs - 8);

but presumably re-introduces the crash the original patch was intended 
to fix.  Should this line be conditioned somehow?  Or do the corefile 
write routines need to know about this adjusted offset?

Steve Work


* Re: Multi-thread corefiles broken since April
  2005-12-08  6:52 Multi-thread corefiles broken since April Steve Work
@ 2005-12-11  0:44 ` Andrew Morton
  2005-12-31 14:28 ` Adrian Bunk
  1 sibling, 0 replies; 4+ messages in thread
From: Andrew Morton @ 2005-12-11  0:44 UTC (permalink / raw)
  To: Steve Work; +Cc: linux-kernel

Steve Work <swork@aventail.com> wrote:
>
> Coredumps from programs with more than one thread show garbage 
> information for all threads except the primary.  The problem was 
> introduced with:
> 
> http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=5df240826c90afdc7956f55a004ea6b702df9203
> 
> on Apr 16 ("fix crash in entry.S restore_all") and is still present in 
> current builds.

Thanks for working that out.

> "kill -SEGV" this program and "info threads" the resulting corefile to 
> see the problem:
> 
> #include <pthread.h>
> static void* thread_sleep(void* x) { while (1) sleep(30); }
> int main(int c, char** v) {
>      const static int tcount = 5;
>      pthread_t thr[tcount];
>      int i;
>      for (i=0; i<tcount; ++i)
>          pthread_create(&thr[i], NULL, thread_sleep, NULL);
>      while (1)
>          sleep(30);
>      return 0;
> }
> 
> (gdb) info threads
>    7 process 18138  0x00000246 in ?? ()
>    6 process 18139  0x00000246 in ?? ()
>    5 process 18140  0x00000246 in ?? ()
>    4 process 18141  0x00000246 in ?? ()
>    3 process 18142  0x00000246 in ?? ()
>    2 process 18143  0x00000246 in ?? ()
> * 1 process 18137  0xb7e69db6 in nanosleep () from /lib/tls/libc.so.6
> (gdb)
> 
> All these threads should show a legitimate location (the same spot in 
> nanosleep) and do on kernels prior to the commit named above.  (Notice 
> one too many threads listed here also -- is this a related problem?)
> 
> Commenting out this line (in asm/i386/kernel/process.c:copy_thread) 
> fixes the corefiles:
> 
>    childregs = (struct pt_regs *) ((unsigned long) childregs - 8);
> 
> but presumably re-introduces the crash the original patch was intended 
> to fix.  Should this line be conditioned somehow?  Or do the corefile 
> write routines need to know about this adjusted offset?
> 

Yes, I guess fixing up the core output would be the way to fix it.


* Re: Multi-thread corefiles broken since April
  2005-12-08  6:52 Multi-thread corefiles broken since April Steve Work
  2005-12-11  0:44 ` Andrew Morton
@ 2005-12-31 14:28 ` Adrian Bunk
  2006-01-01  1:18   ` Stas Sergeev
  1 sibling, 1 reply; 4+ messages in thread
From: Adrian Bunk @ 2005-12-31 14:28 UTC (permalink / raw)
  To: Steve Work, Stas Sergeev; +Cc: linux-kernel

Hi Steve,

please open a bug at http://bugzilla.kernel.org/ for this issue so that 
it doesn't get lost.

@Stas:
It was your patch that broke it; can you look into it?

TIA
Adrian




* Re: Multi-thread corefiles broken since April
  2005-12-31 14:28 ` Adrian Bunk
@ 2006-01-01  1:18   ` Stas Sergeev
  0 siblings, 0 replies; 4+ messages in thread
From: Stas Sergeev @ 2006-01-01  1:18 UTC (permalink / raw)
  To: Adrian Bunk; +Cc: Steve Work, linux-kernel, Andrew Morton


Hi.

Adrian Bunk wrote:
> On Wed, Dec 07, 2005 at 10:52:52PM -0800, Steve Work wrote:
>> Or do the corefile 
>> write routines need to know about this adjusted offset?
I think so; the attached patch seems to help.

Happy new year and happy hacking!


-----
teach dump_task_regs() about the -8 offset.

Signed-off-by: stsp@aknet.ru


[-- Attachment #2: stkfix.diff --]
[-- Type: text/x-patch, Size: 492 bytes --]

--- linux/arch/i386/kernel/process.c.old	2005-08-07 21:58:25.000000000 +0400
+++ linux/arch/i386/kernel/process.c	2006-01-01 03:03:10.000000000 +0300
@@ -573,7 +573,9 @@
 	struct pt_regs ptregs;
 	
 	ptregs = *(struct pt_regs *)
-		((unsigned long)tsk->thread_info+THREAD_SIZE - sizeof(ptregs));
+		((unsigned long)tsk->thread_info +
+		/* see comments in copy_thread() about -8 */
+		THREAD_SIZE - sizeof(ptregs) - 8);
 	ptregs.xcs &= 0xffff;
 	ptregs.xds &= 0xffff;
 	ptregs.xes &= 0xffff;
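
The offset arithmetic in the patch above can be illustrated with a small
user-space sketch. It is not kernel code: the struct pt_regs_i386 layout is
an assumption based on the 2.6-era i386 register frame, THREAD_SIZE is
assumed to be 8 KiB, and eip_seen_by_dumper() is a hypothetical helper that
only mimics the pointer computation in dump_task_regs(). Under those
assumptions, reading the frame 8 bytes too high shifts every field by two
slots, so the "eip" slot picks up the saved eflags. That may be why the
broken corefiles report 0x00000246, which looks like a typical user-mode
EFLAGS value.

```c
#include <stdint.h>
#include <string.h>

#define THREAD_SIZE     8192u  /* assumed: 8 KiB kernel stacks */
#define COPY_THREAD_PAD 8u     /* the "- 8" applied in copy_thread() */

/* Assumed 2.6-era i386 register frame; fixed-width fields keep it at
 * 60 bytes regardless of the host this sketch is compiled on. */
struct pt_regs_i386 {
    uint32_t ebx, ecx, edx, esi, edi, ebp, eax;
    uint32_t xds, xes, orig_eax;
    uint32_t eip, xcs, eflags, esp, xss;
};

/* Hypothetical helper: place a saved frame where copy_thread() would
 * (8 bytes below the usual spot), then read it back the way
 * dump_task_regs() does with the given pad (0 = before the patch,
 * 8 = after the patch). Returns the value found in the eip slot. */
static uint32_t eip_seen_by_dumper(unsigned pad)
{
    static unsigned char stack[THREAD_SIZE];
    struct pt_regs_i386 saved, seen;

    memset(stack, 0, sizeof stack);
    memset(&saved, 0, sizeof saved);
    saved.eip    = 0xb7e69db6u;  /* address taken from the gdb output */
    saved.eflags = 0x00000246u;  /* assumed saved flags value */

    /* Where the child's frame actually lives after the April commit. */
    memcpy(stack + THREAD_SIZE - sizeof saved - COPY_THREAD_PAD,
           &saved, sizeof saved);

    /* Where the core dumper looks for it. */
    memcpy(&seen, stack + THREAD_SIZE - sizeof seen - pad, sizeof seen);
    return seen.eip;
}
```

In this model, eip_seen_by_dumper(0) returns the eflags value because the
read starts two 4-byte slots too late, while eip_seen_by_dumper(8) returns
the real eip, which matches the effect of the "- 8" added by the patch.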

