public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* Success: tty_io flush_to_ldisc() error message triggered
@ 2006-07-22  2:58 Chuck Ebbert
  2006-07-22 14:02 ` Paul Fulghum
  2006-07-22 14:04 ` Paul Fulghum
  0 siblings, 2 replies; 10+ messages in thread
From: Chuck Ebbert @ 2006-07-22  2:58 UTC (permalink / raw)
  To: Paul Fulghum; +Cc: linux-kernel, Alan Cox

Using the patch below that you sent me, this message printed today on
my SMP system when terminating pppd:

        flush_to_ldisc - TTY_FLUSHING set, low_latency=0

And this bug is also in 2.6.17.4 - I forgot to reboot into 2.6.16.x
and the system locked up the same way 2.6.16 used to.  That was made
even more fun by the 'SysRq broken on SMP' bug fixed by a pending
2.6.17.7 patch...


--- 2.6.16.20-d4.orig/include/linux/tty.h
+++ 2.6.16.20-d4/include/linux/tty.h
@@ -266,6 +266,7 @@ struct tty_struct {
 #define TTY_PTY_LOCK 		16	/* pty private */
 #define TTY_NO_WRITE_SPLIT 	17	/* Preserve write boundaries to driver */
 #define TTY_HUPPED 		18	/* Post driver->hangup() */
+#define TTY_FLUSHING 		19	/* Flushing tty buffers to line discipline */
 
 #define TTY_WRITE_FLUSH(tty) tty_write_flush((tty))
 
--- 2.6.16.20-d4.orig/drivers/char/tty_io.c
+++ 2.6.16.20-d4/drivers/char/tty_io.c
@@ -2781,9 +2781,12 @@ static void flush_to_ldisc(void *private
 		return;
 
 	if (test_bit(TTY_DONT_FLIP, &tty->flags)) {
-		/*
-		 * Do it after the next timer tick:
-		 */
+		printk(KERN_ERR"flush_to_ldisc - TTY_DONT_FLIP set\n");
+		schedule_delayed_work(&tty->buf.work, 1);
+		goto out;
+	}
+	if (test_and_set_bit(TTY_FLUSHING, &tty->flags)) {
+		printk(KERN_ERR"flush_to_ldisc - TTY_FLUSHING set, low_latency=%d\n", tty->low_latency);
 		schedule_delayed_work(&tty->buf.work, 1);
 		goto out;
 	}
@@ -2805,6 +2808,7 @@ static void flush_to_ldisc(void *private
 		tty_buffer_free(tty, tbuf);
 	}
 	spin_unlock_irqrestore(&tty->buf.lock, flags);
+	clear_bit(TTY_FLUSHING, &tty->flags);
 out:
 	tty_ldisc_deref(disc);
 }
-- 
Chuck

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Success: tty_io flush_to_ldisc() error message triggered
  2006-07-22  2:58 Success: tty_io flush_to_ldisc() error message triggered Chuck Ebbert
@ 2006-07-22 14:02 ` Paul Fulghum
  2006-07-22 14:04 ` Paul Fulghum
  1 sibling, 0 replies; 10+ messages in thread
From: Paul Fulghum @ 2006-07-22 14:02 UTC (permalink / raw)
  To: Chuck Ebbert; +Cc: linux-kernel, Alan Cox

Chuck Ebbert wrote:
> Using the patch below that you sent me, this message printed today on
> my SMP system when terminating pppd:
> 
>         flush_to_ldisc - TTY_FLUSHING set, low_latency=0
> 
> And this bug is also in 2.6.17.4 - I forgot to reboot into 2.6.16.x
> and the system locked up the same way 2.6.16 used to.  That was made
> even more fun by the 'SysRq broken on SMP' bug fixed by a pending
> 2.6.17.7 patch...

That confirms my thoughts on what went wrong:
multiple copies of the queued work (flush_to_ldisc)
running in parallel and corrupting the free buffer list.

A cleaner fix for this is already
in the 2.6.18 rc series.

Thanks,
Paul

--
Paul Fulghum
Microgate Systems, Ltd.



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Success: tty_io flush_to_ldisc() error message triggered
  2006-07-22  2:58 Success: tty_io flush_to_ldisc() error message triggered Chuck Ebbert
  2006-07-22 14:02 ` Paul Fulghum
@ 2006-07-22 14:04 ` Paul Fulghum
  1 sibling, 0 replies; 10+ messages in thread
From: Paul Fulghum @ 2006-07-22 14:04 UTC (permalink / raw)
  To: Chuck Ebbert; +Cc: linux-kernel, Alan Cox

Chuck Ebbert wrote:
> Using the patch below that you sent me, this message printed today on
> my SMP system when terminating pppd:
> 
>         flush_to_ldisc - TTY_FLUSHING set, low_latency=0
> 
> And this bug is also in 2.6.17.4 - I forgot to reboot into 2.6.16.x
> and the system locked up the same way 2.6.16 used to.  That was made
> even more fun by the 'SysRq broken on SMP' bug fixed by a pending
> 2.6.17.7 patch...

That confirms my thoughts on what went wrong:
multiple copies of the queued work (flush_to_ldisc)
running in parallel and corrupting the free buffer list.

A cleaner fix for this is already
in the 2.6.18 rc series.

Thanks,
Paul

--
Paul Fulghum
Microgate Systems, Ltd.




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Success: tty_io flush_to_ldisc() error message triggered
@ 2006-07-22 16:07 Chuck Ebbert
  2006-07-22 16:41 ` Paul Fulghum
  0 siblings, 1 reply; 10+ messages in thread
From: Chuck Ebbert @ 2006-07-22 16:07 UTC (permalink / raw)
  To: Paul Fulghum; +Cc: Alan Cox, linux-kernel, linux-stable

In-Reply-To: <44C2307C.9060607@microgate.com>

On Sat, 22 Jul 2006 09:04:44 -0500, Paul Fulghum wrote:

> That confirms my thoughts on what went wrong:
> multiple copies of the queued work (flush_to_ldisc)
> running in parallel and corrupting the free buffer list.
> 
> A cleaner fix for this is already
> in the 2.6.18 rc series.

The cleaner fix looks more intrusive, though.

Is this simpler change (what I'm running but without the warning
messages) the preferred fix for -stable?


From: Paul Fulghum <paulkf@microgate.com>

Serialize flush_to_ldisc() per-device. Fixes free list corruption
that causes lockup on SMP systems.

Signed-off-by: Paul Fulghum <paulkf@microgate.com>
Acked-by: Chuck Ebbert <76306.1226@compuserve.com>

--- 2.6.16.20-d4.orig/include/linux/tty.h
+++ 2.6.16.20-d4/include/linux/tty.h
@@ -266,6 +266,7 @@ struct tty_struct {
 #define TTY_PTY_LOCK 		16	/* pty private */
 #define TTY_NO_WRITE_SPLIT 	17	/* Preserve write boundaries to driver */
 #define TTY_HUPPED 		18	/* Post driver->hangup() */
+#define TTY_FLUSHING 		19	/* Flushing tty buffers to line discipline */
 
 #define TTY_WRITE_FLUSH(tty) tty_write_flush((tty))
 
--- 2.6.16.20-d4.orig/drivers/char/tty_io.c
+++ 2.6.16.20-d4/drivers/char/tty_io.c
@@ -2780,10 +2780,8 @@ static void flush_to_ldisc(void *private
 	if (disc == NULL)	/*  !TTY_LDISC */
 		return;
 
-	if (test_bit(TTY_DONT_FLIP, &tty->flags)) {
-		/*
-		 * Do it after the next timer tick:
-		 */
+	if (test_bit(TTY_DONT_FLIP, &tty->flags) ||
+	    test_and_set_bit(TTY_FLUSHING, &tty->flags)) {
 		schedule_delayed_work(&tty->buf.work, 1);
 		goto out;
 	}
@@ -2805,6 +2803,7 @@ static void flush_to_ldisc(void *private
 		tty_buffer_free(tty, tbuf);
 	}
 	spin_unlock_irqrestore(&tty->buf.lock, flags);
+	clear_bit(TTY_FLUSHING, &tty->flags);
 out:
 	tty_ldisc_deref(disc);
 }
-- 
Chuck

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Success: tty_io flush_to_ldisc() error message triggered
  2006-07-22 16:07 Success: tty_io flush_to_ldisc() error message triggered Chuck Ebbert
@ 2006-07-22 16:41 ` Paul Fulghum
  2006-07-25 18:41   ` [stable] " Greg KH
  0 siblings, 1 reply; 10+ messages in thread
From: Paul Fulghum @ 2006-07-22 16:41 UTC (permalink / raw)
  To: Chuck Ebbert; +Cc: Alan Cox, linux-kernel, linux-stable

Chuck Ebbert wrote:
> The cleaner fix looks more intrusive, though.
> 
> Is this simpler change (what I'm running but without the warning
> messages) the preferred fix for -stable?

It fixes the problem.

The ugly thing about that patch is adding
another TTY_XXX macro that is not really necessary.
I created it as a quick and dirty way of testing
my hypothesis about the problem.

I don't see a problem using it as a simple
(albeit naive) approach to fix stable.

--
Paul Fulghum
Microgate Systems, Ltd

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [stable] Success: tty_io flush_to_ldisc() error message triggered
  2006-07-22 16:41 ` Paul Fulghum
@ 2006-07-25 18:41   ` Greg KH
  2006-07-25 19:12     ` Paul Fulghum
  0 siblings, 1 reply; 10+ messages in thread
From: Greg KH @ 2006-07-25 18:41 UTC (permalink / raw)
  To: Paul Fulghum; +Cc: Chuck Ebbert, linux-kernel, Alan Cox, linux-stable

On Sat, Jul 22, 2006 at 11:41:44AM -0500, Paul Fulghum wrote:
> Chuck Ebbert wrote:
> > The cleaner fix looks more intrusive, though.
> > 
> > Is this simpler change (what I'm running but without the warning
> > messages) the preferred fix for -stable?
> 
> It fixes the problem.

So do you feel this patch should be added to the -stable kernel tree?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [stable] Success: tty_io flush_to_ldisc() error message triggered
  2006-07-25 18:41   ` [stable] " Greg KH
@ 2006-07-25 19:12     ` Paul Fulghum
  2006-07-26  7:16       ` Greg KH
  0 siblings, 1 reply; 10+ messages in thread
From: Paul Fulghum @ 2006-07-25 19:12 UTC (permalink / raw)
  To: Greg KH; +Cc: Chuck Ebbert, linux-kernel, Alan Cox, linux-stable

Greg KH wrote:
> On Sat, Jul 22, 2006 at 11:41:44AM -0500, Paul Fulghum wrote:
> 
>>Chuck Ebbert wrote:
>>
>>>The cleaner fix looks more intrusive, though.
>>>
>>>Is this simpler change (what I'm running but without the warning
>>>messages) the preferred fix for -stable?
>>
>>It fixes the problem.
> 
> 
> So do you feel this patch should be added to the -stable kernel tree?

No. Now that I think about it, adding that extra
macro is just wrong even if temporary.

The real fix is equally simple, but in 2.6.18-rc
it is intertwined with other more intrusive changes.

Let me make a new separate patch that does things
the right way, which is simply removing the list
head while processing the list so two instances
to not trip over each other. I would have done so
earlier, but I've been insanely busy with multiple
work related deadlines (lame excuse I know).

I should post something tomorrow afternoon.

-- 
Paul Fulghum
Microgate Systems, Ltd.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [stable] Success: tty_io flush_to_ldisc() error message triggered
  2006-07-25 19:12     ` Paul Fulghum
@ 2006-07-26  7:16       ` Greg KH
       [not found]         ` <1153941029.6903.5.camel@amdx2.microgate.com>
  0 siblings, 1 reply; 10+ messages in thread
From: Greg KH @ 2006-07-26  7:16 UTC (permalink / raw)
  To: Paul Fulghum; +Cc: Alan Cox, linux-kernel, Chuck Ebbert, linux-stable

On Tue, Jul 25, 2006 at 02:12:28PM -0500, Paul Fulghum wrote:
> Greg KH wrote:
> > On Sat, Jul 22, 2006 at 11:41:44AM -0500, Paul Fulghum wrote:
> > 
> >>Chuck Ebbert wrote:
> >>
> >>>The cleaner fix looks more intrusive, though.
> >>>
> >>>Is this simpler change (what I'm running but without the warning
> >>>messages) the preferred fix for -stable?
> >>
> >>It fixes the problem.
> > 
> > 
> > So do you feel this patch should be added to the -stable kernel tree?
> 
> No. Now that I think about it, adding that extra
> macro is just wrong even if temporary.
> 
> The real fix is equally simple, but in 2.6.18-rc
> it is intertwined with other more intrusive changes.
> 
> Let me make a new separate patch that does things
> the right way, which is simply removing the list
> head while processing the list so two instances
> to not trip over each other. I would have done so
> earlier, but I've been insanely busy with multiple
> work related deadlines (lame excuse I know).
> 
> I should post something tomorrow afternoon.

Ok, we can wait, I'd rather have the proper fix instead of the band-aid.

Just send it to stable@kernel.org when you have something that you feel
comfortable with.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH]: tty buffering limit
       [not found]         ` <1153941029.6903.5.camel@amdx2.microgate.com>
@ 2006-07-28 13:53           ` Alan Cox
  2006-07-28 15:36             ` Paul Fulghum
  0 siblings, 1 reply; 10+ messages in thread
From: Alan Cox @ 2006-07-28 13:53 UTC (permalink / raw)
  To: Paul Fulghum; +Cc: linux-kernel

Certain people seem to have assumed tty->throttled was 'advisory'. In
the absence of tty->author->throttle(), it seems we should keep a simple
limit of our own to avoid problems when this occurs. 

Signed-off-by: Alan Cox <alan@redhat.com>


diff -u --new-file --recursive --exclude-from /usr/src/exclude linux.vanilla-2.6.18-rc2-mm1/drivers/char/tty_io.c linux-2.6.18-rc2-mm1/drivers/char/tty_io.c
--- linux.vanilla-2.6.18-rc2-mm1/drivers/char/tty_io.c	2006-07-27 16:19:51.000000000 +0100
+++ linux-2.6.18-rc2-mm1/drivers/char/tty_io.c	2006-07-27 16:44:17.000000000 +0100
@@ -247,6 +247,7 @@
 		kfree(thead);
 	}
 	tty->buf.tail = NULL;
+	tty->buf.memory_used = 0;
 }
 
 static void tty_buffer_init(struct tty_struct *tty)
@@ -255,11 +256,16 @@
 	tty->buf.head = NULL;
 	tty->buf.tail = NULL;
 	tty->buf.free = NULL;
+	tty->buf.memory_used = 0;
 }
 
-static struct tty_buffer *tty_buffer_alloc(size_t size)
+static struct tty_buffer *tty_buffer_alloc(struct tty_struct *tty, size_t size)
 {
-	struct tty_buffer *p = kmalloc(sizeof(struct tty_buffer) + 2 * size, GFP_ATOMIC);
+	struct tty_buffer *p;
+
+	if (tty->buf.memory_used + size > 65536)
+		return NULL;
+	p = kmalloc(sizeof(struct tty_buffer) + 2 * size, GFP_ATOMIC);
 	if(p == NULL)
 		return NULL;
 	p->used = 0;
@@ -269,7 +275,7 @@
 	p->read = 0;
 	p->char_buf_ptr = (char *)(p->data);
 	p->flag_buf_ptr = (unsigned char *)p->char_buf_ptr + size;
-/* 	printk("Flip create %p\n", p); */
+	tty->buf.memory_used += size;
 	return p;
 }
 
@@ -279,7 +285,9 @@
 static void tty_buffer_free(struct tty_struct *tty, struct tty_buffer *b)
 {
 	/* Dumb strategy for now - should keep some stats */
-/* 	printk("Flip dispose %p\n", b); */
+	tty->buf.memory_used -= b->size;
+	WARN_ON(tty->buf.memory_used < 0);
+
 	if(b->size >= 512)
 		kfree(b);
 	else {
@@ -299,16 +307,14 @@
 			t->used = 0;
 			t->commit = 0;
 			t->read = 0;
-			/* DEBUG ONLY */
-			memset(t->data, '*', size);
-/* 			printk("Flip recycle %p\n", t); */
+			tty->buf.memory_used += t->size;
 			return t;
 		}
 		tbh = &((*tbh)->next);
 	}
 	/* Round the buffer size out */
 	size = (size + 0xFF) & ~ 0xFF;
-	return tty_buffer_alloc(size);
+	return tty_buffer_alloc(tty, size);
 	/* Should possibly check if this fails for the largest buffer we
 	   have queued and recycle that ? */
 }
diff -u --new-file --recursive --exclude-from /usr/src/exclude linux.vanilla-2.6.18-rc2-mm1/include/linux/tty.h linux-2.6.18-rc2-mm1/include/linux/tty.h
--- linux.vanilla-2.6.18-rc2-mm1/include/linux/tty.h	2006-07-27 16:19:26.000000000 +0100
+++ linux-2.6.18-rc2-mm1/include/linux/tty.h	2006-07-27 16:45:37.000000000 +0100
@@ -59,6 +59,7 @@
 	struct tty_buffer *head;	/* Queue head */
 	struct tty_buffer *tail;	/* Active buffer */
 	struct tty_buffer *free;	/* Free queue head */
+	int memory_used;		/* Buffer space used excluding free queue */
 };
 /*
  * The pty uses char_buf and flag_buf as a contiguous buffer


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH]: tty buffering limit
  2006-07-28 13:53           ` [PATCH]: tty buffering limit Alan Cox
@ 2006-07-28 15:36             ` Paul Fulghum
  0 siblings, 0 replies; 10+ messages in thread
From: Paul Fulghum @ 2006-07-28 15:36 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel

Alan Cox wrote:
> Certain people seem to have assumed tty->throttled was 'advisory'. In
> the absence of tty->author->throttle(), it seems we should keep a simple
> limit of our own to avoid problems when this occurs. 

Looks reasonable as a general failsafe. There
may be other pathological cases it protects against.

I'll start banging on it.

-- 
Paul Fulghum
Microgate Systems, Ltd.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2006-07-28 15:36 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-07-22 16:07 Success: tty_io flush_to_ldisc() error message triggered Chuck Ebbert
2006-07-22 16:41 ` Paul Fulghum
2006-07-25 18:41   ` [stable] " Greg KH
2006-07-25 19:12     ` Paul Fulghum
2006-07-26  7:16       ` Greg KH
     [not found]         ` <1153941029.6903.5.camel@amdx2.microgate.com>
2006-07-28 13:53           ` [PATCH]: tty buffering limit Alan Cox
2006-07-28 15:36             ` Paul Fulghum
  -- strict thread matches above, loose matches on Subject: below --
2006-07-22  2:58 Success: tty_io flush_to_ldisc() error message triggered Chuck Ebbert
2006-07-22 14:02 ` Paul Fulghum
2006-07-22 14:04 ` Paul Fulghum

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox