* [uml-devel] [PATCH] fix for 2.6 scheduler crashes
@ 2004-02-26 18:01 Jeff Dike
  2004-02-26 19:54 ` William Stearns
From: Jeff Dike @ 2004-02-26 18:01 UTC
  To: wstearns, blaisorblade_spam, user-mode-linux-devel

The patch below fixes the 2.6 scheduler crash.  It's another subtle signal
problem.  This one is caused by siglongjmp restoring the old signal mask
before calling longjmp.  See the comment for the gory details.  I've had
the stress test running for more than an hour with this patch.  Without it,
it would be lucky to go for five minutes.

				Jeff


===== arch/um/include/signal_user.h 1.1 vs edited =====
--- 1.1/arch/um/include/signal_user.h   Fri Sep  6 13:29:28 2002
+++ edited/arch/um/include/signal_user.h        Tue Feb 24 23:02:49 2004
@@ -11,6 +11,8 @@
 extern int change_sig(int signal, int on);
 extern void set_sigstack(void *stack, int size);
 extern void set_handler(int sig, void (*handler)(int), int flags, ...);
+extern int set_signals(int enable);
+extern int get_signals(void);
 
 #endif
 
===== arch/um/kernel/time_kern.c 1.12 vs edited =====
--- 1.12/arch/um/kernel/time_kern.c     Thu Jan  8 07:43:01 2004
+++ edited/arch/um/kernel/time_kern.c   Tue Feb 24 08:48:39 2004
@@ -102,10 +102,12 @@
 
 irqreturn_t um_timer(int irq, void *dev, struct pt_regs *regs)
 {
+       unsigned long flags;
+
        do_timer(regs);
-       write_seqlock_irq(&xtime_lock);
+       write_seqlock_irqsave(&xtime_lock, flags);
        timer();
-       write_sequnlock_irq(&xtime_lock);
+       write_sequnlock_irqrestore(&xtime_lock, flags);
        return(IRQ_HANDLED);
 }
 
===== arch/um/kernel/skas/process.c 1.9 vs edited =====
--- 1.9/arch/um/kernel/skas/process.c   Thu Jan  8 07:43:01 2004
+++ edited/arch/um/kernel/skas/process.c        Tue Feb 24 23:00:59 2004
@@ -188,13 +188,25 @@
 void new_thread(void *stack, void **switch_buf_ptr, void **fork_buf_ptr,
                void (*handler)(int))
 {
+       unsigned long flags;
        jmp_buf switch_buf, fork_buf;
 
        *switch_buf_ptr = &switch_buf;
        *fork_buf_ptr = &fork_buf;
 
+       /* Somewhat subtle - siglongjmp restores the signal mask before doing
+        * the longjmp.  This means that when jumping from one stack to another
+        * when the target stack has interrupts enabled, an interrupt may occur
+        * on the source stack.  This is bad when starting up a process because
+        * it's not supposed to get timer ticks until it has been scheduled.
+        * So, we disable interrupts around the sigsetjmp to ensure that 
+        * they can't happen until we get back here where they are safe.
+        */
+       flags = get_signals();
+       block_signals();
        if(sigsetjmp(fork_buf, 1) == 0)
                new_thread_proc(stack, handler);
+       set_signals(flags);
 
        remove_sigstack();
 }
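
For readers unfamiliar with the mask handling that the comment in
new_thread() refers to, here is a minimal standalone sketch. It is not UML
code: it assumes a POSIX system, uses plain sigprocmask() as a stand-in for
UML's block_signals()/set_signals(), and the helper names (is_blocked,
set_blocked, blocked_after_siglongjmp) are made up for illustration. It
shows that when sigsetjmp() is called with a nonzero savemask, siglongjmp()
reinstates the mask that was in effect at the sigsetjmp(), regardless of the
mask at the time of the jump. That appears to be the behaviour the patch
relies on when it blocks signals before sigsetjmp(fork_buf, 1).

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical helper: 1 if sig is blocked in the caller's mask, else 0. */
static int is_blocked(int sig)
{
    sigset_t cur;
    int rc = sigprocmask(SIG_BLOCK, NULL, &cur);  /* NULL set: query only */

    assert(rc == 0);
    (void)rc;
    return sigismember(&cur, sig) == 1;
}

/* Hypothetical helper: block (blocked != 0) or unblock a single signal. */
static void set_blocked(int sig, int blocked)
{
    sigset_t set;
    int rc;

    sigemptyset(&set);
    sigaddset(&set, sig);
    rc = sigprocmask(blocked ? SIG_BLOCK : SIG_UNBLOCK, &set, NULL);
    assert(rc == 0);
    (void)rc;
}

/*
 * Block SIGALRM, record the context with sigsetjmp(), then unblock SIGALRM
 * and jump back.  Returns 1 if SIGALRM is blocked after landing, else 0.
 * The caller's original mask is put back before returning.
 */
int blocked_after_siglongjmp(int savemask)
{
    sigjmp_buf buf;
    sigset_t orig;
    int landed_blocked;

    sigprocmask(SIG_BLOCK, NULL, &orig);    /* remember the caller's mask */

    set_blocked(SIGALRM, 1);                /* mask at sigsetjmp(): blocked */
    if (sigsetjmp(buf, savemask) == 0) {
        set_blocked(SIGALRM, 0);            /* mask at jump time: unblocked */
        siglongjmp(buf, 1);                 /* restores saved mask, if any */
    }

    landed_blocked = is_blocked(SIGALRM);
    sigprocmask(SIG_SETMASK, &orig, NULL);  /* undo our changes */
    return landed_blocked;
}
```

With savemask set to 1 the function should report SIGALRM as blocked on
landing; with savemask set to 0 no mask was saved, so the unblocked mask
from the time of the jump stays in effect.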


_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel


* Re: [uml-devel] [PATCH] fix for 2.6 scheduler crashes
  2004-02-26 18:01 [uml-devel] [PATCH] fix for 2.6 scheduler crashes Jeff Dike
@ 2004-02-26 19:54 ` William Stearns
From: William Stearns @ 2004-02-26 19:54 UTC
  To: Jeff Dike; +Cc: blaisorblade_spam, ML-uml-devel, William Stearns

Good afternoon, Jeff,

On Thu, 26 Feb 2004, Jeff Dike wrote:

> The patch below fixes the 2.6 scheduler crash.  It's another subtle signal
> problem.  This one is caused by siglongjmp restoring the old signal mask
> before calling longjmp.  See the comment for the gory details.  I've had
> the stress test running for more than an hour with this patch.  Without it,
> it would be lucky to go for five minutes.

	Many thanks for the work!
	I've got one vm with 25 copies of "while /bin/true ; do /bin/true 
; done &" running and me repeatedly logging in over ssh from the outside 
world; the load is high, but no crashes.  The system even stays 
responsive.
	Only 25 minutes of testing so far, but it looks good.  I'd like to 
give a tentative thumbs up.
	For others hoping to try this patch out, you'll need to pass the
"-l" (--ignore-whitespace) option to patch to apply this; the tabs in the
actual source were replaced with 8 spaces in Jeff's patch.
	If you'd simply like a precompiled binary, the one I'm testing is 
at http://www.stearns.org/uml/ ; the config, System.map, and linux all end 
with norecurs-patch2{,.bz2}.
	Cheers,
	- Bill

---------------------------------------------------------------------------
        "It is easy to be blinded to the essential uselessness of
computers by the sense of accomplishment you get from getting them to
work at all."
        -- Douglas Adams
--------------------------------------------------------------------------
William Stearns (wstearns@pobox.com).  Mason, Buildkernel, freedups, p0f,
rsync-backup, ssh-keyinstall, dns-check, more at:   http://www.stearns.org
--------------------------------------------------------------------------


