public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* PROBLEM: Race condition in tty buffer's function flush_to_ldisc().
@ 2011-11-07 11:14 Ilya Zykov
  2011-11-07 11:58 ` Alan Cox
  0 siblings, 1 reply; 11+ messages in thread
From: Ilya Zykov @ 2011-11-07 11:14 UTC (permalink / raw)
  To: Alan Cox; +Cc: Greg Kroah-Hartman, linux-kernel, Ilya Zykov

Function flush_to_ldisc() call disc->ops->receive_buf(),
without tty->buf.lock and with TTY_FLUSHING bit set.
If we have deferred TTY_FLUSHPENDING request,
another thread can grab tty->buf.lock, and
flush tty's buffer when receive_buf() use its.

Signed-off-by: Ilya Zykov <ilya@ilyx.ru>
---
diff -uprN -X ../../../dontdiff a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
--- a/drivers/tty/tty_buffer.c	2011-11-07 14:30:46.000000000 +0400
+++ b/drivers/tty/tty_buffer.c	2011-11-07 14:35:32.000000000 +0400
@@ -427,11 +427,15 @@ static void flush_to_ldisc(struct work_s
 				tty_buffer_free(tty, head);
 				continue;
 			}
-			/* Ldisc or user is trying to flush the buffers
-			   we are feeding to the ldisc, stop feeding the
-			   line discipline as we want to empty the queue */
-			if (test_bit(TTY_FLUSHPENDING, &tty->flags))
+			/* Ldisc or user is trying to flush the buffers.
+			   We may have a deferred request to flush the input buffer,
+	   		   if so pull the chain under the lock and empty the queue */
+			if (test_bit(TTY_FLUSHPENDING, &tty->flags)) {
+				__tty_buffer_flush(tty);
+				clear_bit(TTY_FLUSHPENDING, &tty->flags);
+				wake_up(&tty->read_wait);
 				break;
+			}
 			if (!tty->receive_room)
 				break;
 			if (count > tty->receive_room)
@@ -447,13 +451,6 @@ static void flush_to_ldisc(struct work_s
 		clear_bit(TTY_FLUSHING, &tty->flags);
 	}
 
-	/* We may have a deferred request to flush the input buffer,
-	   if so pull the chain under the lock and empty the queue */
-	if (test_bit(TTY_FLUSHPENDING, &tty->flags)) {
-		__tty_buffer_flush(tty);
-		clear_bit(TTY_FLUSHPENDING, &tty->flags);
-		wake_up(&tty->read_wait);
-	}
 	spin_unlock_irqrestore(&tty->buf.lock, flags);
 
 	tty_ldisc_deref(disc);

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: PROBLEM: Race condition in tty buffer's function flush_to_ldisc().
  2011-11-07 11:14 PROBLEM: Race condition in tty buffer's function flush_to_ldisc() Ilya Zykov
@ 2011-11-07 11:58 ` Alan Cox
  2011-11-07 12:20   ` Ilya Zykov
  2011-11-07 12:35   ` Ilya Zykov
  0 siblings, 2 replies; 11+ messages in thread
From: Alan Cox @ 2011-11-07 11:58 UTC (permalink / raw)
  To: Ilya Zykov; +Cc: Greg Kroah-Hartman, linux-kernel

On Mon, 07 Nov 2011 15:14:34 +0400
Ilya Zykov <ilya@ilyx.ru> wrote:

> Function flush_to_ldisc() call disc->ops->receive_buf(),
> without tty->buf.lock and with TTY_FLUSHING bit set.

flush_to_ldisc is single threaded for a given tty. If you fail to
ensure that is the case everything breaks.

What cases does this occur ?

Also for the other patches do you have benchmarks yet ?

Alan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: PROBLEM: Race condition in tty buffer's function flush_to_ldisc().
  2011-11-07 11:58 ` Alan Cox
@ 2011-11-07 12:20   ` Ilya Zykov
  2011-11-07 13:06     ` Alan Cox
  2011-11-07 12:35   ` Ilya Zykov
  1 sibling, 1 reply; 11+ messages in thread
From: Ilya Zykov @ 2011-11-07 12:20 UTC (permalink / raw)
  To: Alan Cox; +Cc: Greg Kroah-Hartman, linux-kernel

Alan Cox wrote:

> On Mon, 07 Nov 2011 15:14:34 +0400
> Ilya Zykov <ilya@ilyx.ru> wrote:
> 
>> Function flush_to_ldisc() call disc->ops->receive_buf(),
>> without tty->buf.lock and with TTY_FLUSHING bit set.
> 
> flush_to_ldisc is single threaded for a given tty. If you fail to
> ensure that is the case everything breaks.

I think so. But if add in flush_to_ldisc():

if (!test_and_set_bit(TTY_FLUSHING, &tty->flags)) {
	.....
} else
	prink...
In my syslog appear this prink on pty devices.

> 
> What cases does this occur ?
> 
> Also for the other patches do you have benchmarks yet ?
> 
> Alan
> 



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: PROBLEM: Race condition in tty buffer's function flush_to_ldisc().
  2011-11-07 11:58 ` Alan Cox
  2011-11-07 12:20   ` Ilya Zykov
@ 2011-11-07 12:35   ` Ilya Zykov
  1 sibling, 0 replies; 11+ messages in thread
From: Ilya Zykov @ 2011-11-07 12:35 UTC (permalink / raw)
  To: Alan Cox; +Cc: Greg Kroah-Hartman, linux-kernel, Ilya Zykov

Alan Cox wrote:

> On Mon, 07 Nov 2011 15:14:34 +0400
> Ilya Zykov <ilya@ilyx.ru> wrote:
> 
>> Function flush_to_ldisc() call disc->ops->receive_buf(),
>> without tty->buf.lock and with TTY_FLUSHING bit set.
> 
> flush_to_ldisc is single threaded for a given tty. If you fail to
> ensure that is the case everything breaks.


Why we need: "if (!test_and_set_bit(TTY_FLUSHING, &tty->flags)) {"
if flush_to_ldisc is single threaded?
we can: set_bit(TTY_FLUSHING, &tty->flags)
without if() at all.

> 
> What cases does this occur ?
> 
> Also for the other patches do you have benchmarks yet ?
> 
> Alan
> 



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: PROBLEM: Race condition in tty buffer's function flush_to_ldisc().
  2011-11-07 12:20   ` Ilya Zykov
@ 2011-11-07 13:06     ` Alan Cox
  2011-11-07 13:44       ` Ilya Zykov
  0 siblings, 1 reply; 11+ messages in thread
From: Alan Cox @ 2011-11-07 13:06 UTC (permalink / raw)
  To: Ilya Zykov; +Cc: Greg Kroah-Hartman, linux-kernel

> if (!test_and_set_bit(TTY_FLUSHING, &tty->flags)) {
> 	.....
> } else
> 	prink...
> In my syslog appear this prink on pty devices.

Then we need to understand the call paths that did this. I suspect it
may be a bug in the hacks Linus made to n_tty. They've always bothered
me as they never looked correct.

Can you get traces to see if it is actually calling flush_to_ldisc
multithreaded on a given tty

> Why we need: "if (!test_and_set_bit(TTY_FLUSHING, &tty->flags)) {"
> if flush_to_ldisc is single threaded?
> we can: set_bit(TTY_FLUSHING, &tty->flags)
> without if() at all.

It is single threaded with respect to itself (you can't have two
flush_to_ldisc on the same tty at once) but you can have a parallel
call to tty_buffer_flush. The tty_buffer_flush path needs to pick the
right approach reliably.

So the theory is

If TTY_FLUSHING is set then tty_buffer_flush lets the flush_to_ldisc do
the work. During this time we won't fluish a buffer under the ldisc.

If TTY_FLUSHING is not set then we will do the flush directly. As we
hold tty->buf.lock at that point the two cannot race as far as I can
see.

We are actually now probably at the point we could take a per tty mutex
on the flush_to_ldisc and use it to tidy up ldisc change, flush and
other things. So this code could welll be something worth simplifying
once the rest of the tty_lock() is cleaned out.

Alan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: PROBLEM: Race condition in tty buffer's function flush_to_ldisc().
  2011-11-07 13:06     ` Alan Cox
@ 2011-11-07 13:44       ` Ilya Zykov
  2011-11-07 14:50         ` Alan Cox
  0 siblings, 1 reply; 11+ messages in thread
From: Ilya Zykov @ 2011-11-07 13:44 UTC (permalink / raw)
  To: Alan Cox; +Cc: Greg Kroah-Hartman, linux-kernel

Alan Cox wrote:

> 
>> Why we need: "if (!test_and_set_bit(TTY_FLUSHING, &tty->flags)) {"
>> if flush_to_ldisc is single threaded?
>> we can: set_bit(TTY_FLUSHING, &tty->flags)
>> without if() at all.
> 
> It is single threaded with respect to itself (you can't have two
> flush_to_ldisc on the same tty at once) but you can have a parallel
> call to tty_buffer_flush. The tty_buffer_flush path needs to pick the
> right approach reliably.
> 


Of course I know about tty_buffer_flush(), it only read TTY_FLUSHING,
it can't change TTY_FLUSHING, if flush_to_ldisc() single threaded,
we can change TTY_FLUSHING only in one place in one time(in flush_to_ldisc()),
therefor we can use only "set_bit(TTY_FLUSHING, &tty->flags)" without test.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: PROBLEM: Race condition in tty buffer's function flush_to_ldisc().
  2011-11-07 13:44       ` Ilya Zykov
@ 2011-11-07 14:50         ` Alan Cox
  2011-11-07 14:52           ` Ilya Zykov
  2011-11-07 18:26           ` Ilya Zykov
  0 siblings, 2 replies; 11+ messages in thread
From: Alan Cox @ 2011-11-07 14:50 UTC (permalink / raw)
  To: Ilya Zykov; +Cc: Greg Kroah-Hartman, linux-kernel

> Of course I know about tty_buffer_flush(), it only read TTY_FLUSHING,
> it can't change TTY_FLUSHING, if flush_to_ldisc() single threaded,
> we can change TTY_FLUSHING only in one place in one time(in
> flush_to_ldisc()), therefor we can use only "set_bit(TTY_FLUSHING,
> &tty->flags)" without test.

Yes.. if you can pin down why in your testing you see the other case
sometimes being true.

Alan
 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: PROBLEM: Race condition in tty buffer's function flush_to_ldisc().
  2011-11-07 14:50         ` Alan Cox
@ 2011-11-07 14:52           ` Ilya Zykov
  2011-11-07 18:26           ` Ilya Zykov
  1 sibling, 0 replies; 11+ messages in thread
From: Ilya Zykov @ 2011-11-07 14:52 UTC (permalink / raw)
  To: Alan Cox; +Cc: Greg Kroah-Hartman, linux-kernel

Alan Cox wrote:

>> Of course I know about tty_buffer_flush(), it only read TTY_FLUSHING,
>> it can't change TTY_FLUSHING, if flush_to_ldisc() single threaded,
>> we can change TTY_FLUSHING only in one place in one time(in
>> flush_to_ldisc()), therefor we can use only "set_bit(TTY_FLUSHING,
>> &tty->flags)" without test.
> 
> Yes.. if you can pin down why in your testing you see the other case
> sometimes being true.
> 
> Alan
>  
> 
> 

OK, I try.
Ilya.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: PROBLEM: Race condition in tty buffer's function flush_to_ldisc().
  2011-11-07 14:50         ` Alan Cox
  2011-11-07 14:52           ` Ilya Zykov
@ 2011-11-07 18:26           ` Ilya Zykov
  2011-11-07 18:39             ` Alan Cox
  1 sibling, 1 reply; 11+ messages in thread
From: Ilya Zykov @ 2011-11-07 18:26 UTC (permalink / raw)
  To: Alan Cox; +Cc: Greg Kroah-Hartman, linux-kernel

Alan Cox wrote:

>> Of course I know about tty_buffer_flush(), it only read TTY_FLUSHING,
>> it can't change TTY_FLUSHING, if flush_to_ldisc() single threaded,
>> we can change TTY_FLUSHING only in one place in one time(in
>> flush_to_ldisc()), therefor we can use only "set_bit(TTY_FLUSHING,
>> &tty->flags)" without test.
> 
> Yes.. if you can pin down why in your testing you see the other case
> sometimes being true.
> 
> Alan
>  

Nested call flush_to_ldisc() happened on different CPU only. It happens because
one side pty call "schedule_work(&tty->buf.work)" on one CPU from "tty_flip_buffer_push()",
the other side call "schedule_work(&tty->buf.work)" from n_tty's layer from "n_tty_set_room()"
on different CPU. It also can happen in interrupt if interrupt handle on 
different CPU. Schedule_work() schedule work on CPU which was called(IMHO).

For testing I use in xterminal: cat big_file
and flush_to_ldisc() with this patch:

diff -uprN -X ../../../dontdiff a/drivers/tty/tty_buffer.c c/drivers/tty/tty_buffer.c
--- a/drivers/tty/tty_buffer.c	2011-11-07 14:45:27.000000000 +0400
+++ c/drivers/tty/tty_buffer.c	2011-11-07 21:48:49.000000000 +0400
@@ -405,13 +405,14 @@ static void flush_to_ldisc(struct work_s
 		container_of(work, struct tty_struct, buf.work);
 	unsigned long 	flags;
 	struct tty_ldisc *disc;
+	static int mthread;
 
 	disc = tty_ldisc_ref(tty);
 	if (disc == NULL)	/*  !TTY_LDISC */
 		return;
 
 	spin_lock_irqsave(&tty->buf.lock, flags);
-
+	mthread = 0;
 	if (!test_and_set_bit(TTY_FLUSHING, &tty->flags)) {
 		struct tty_buffer *head;
 		while ((head = tty->buf.head) != NULL) {
@@ -445,6 +446,12 @@ static void flush_to_ldisc(struct work_s
 			spin_lock_irqsave(&tty->buf.lock, flags);
 		}
 		clear_bit(TTY_FLUSHING, &tty->flags);
+	} else {
+		mthread = 1;
+		printk(KERN_WARNING "Tty %s FLUSHING CPU %d.\n", tty->name, percpu_read(cpu_number));
+	}
+	if (mthread) {
+		printk(KERN_WARNING "Tty %s FLUSHING multithreaded CPU %d.\n", tty->name, percpu_read(cpu_number));
 	}
 
 	/* We may have a deferred request to flush the input buffer,

And get in my syslog:

Nov  7 21:34:13 serh kernel: [   76.323848] Tty ptm0 FLUSHING CPU 0.
Nov  7 21:34:13 serh kernel: [   76.323850] Tty ptm0 FLUSHING multithreaded CPU 0.
Nov  7 21:34:13 serh kernel: [   76.323856] Tty ptm0 FLUSHING CPU 0.
Nov  7 21:34:13 serh kernel: [   76.323857] Tty ptm0 FLUSHING multithreaded CPU 0.
Nov  7 21:34:13 serh kernel: [   76.323861] Tty ptm0 FLUSHING multithreaded CPU 1.
Nov  7 21:34:13 serh kernel: [   76.336022] Tty ptm0 FLUSHING CPU 0.
Nov  7 21:34:13 serh kernel: [   76.336024] Tty ptm0 FLUSHING multithreaded CPU 0.
Nov  7 21:34:13 serh kernel: [   76.336030] Tty ptm0 FLUSHING multithreaded CPU 1.
Nov  7 21:34:13 serh kernel: [   76.353134] Tty ptm0 FLUSHING CPU 0.
Nov  7 21:34:13 serh kernel: [   76.353136] Tty ptm0 FLUSHING multithreaded CPU 0.
Nov  7 21:34:13 serh kernel: [   76.353143] Tty ptm0 FLUSHING CPU 0.
Nov  7 21:34:13 serh kernel: [   76.353145] Tty ptm0 FLUSHING multithreaded CPU 0.
Nov  7 21:34:13 serh kernel: [   76.353148] Tty ptm0 FLUSHING multithreaded CPU 1.
......
......

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: PROBLEM: Race condition in tty buffer's function flush_to_ldisc().
  2011-11-07 18:26           ` Ilya Zykov
@ 2011-11-07 18:39             ` Alan Cox
  2011-11-07 18:41               ` Alan Cox
  0 siblings, 1 reply; 11+ messages in thread
From: Alan Cox @ 2011-11-07 18:39 UTC (permalink / raw)
  To: Ilya Zykov; +Cc: Alan Cox, Greg Kroah-Hartman, linux-kernel

> Nested call flush_to_ldisc() happened on different CPU only. It happens because
> one side pty call "schedule_work(&tty->buf.work)" on one CPU from "tty_flip_buffer_push()",
> the other side call "schedule_work(&tty->buf.work)" from n_tty's layer from "n_tty_set_room()"
> on different CPU. It also can happen in interrupt if interrupt handle on 
> different CPU. Schedule_work() schedule work on CPU which was called(IMHO).

But the two sides are different ttys so that specific case shouldn't
matter.

Something is clearly happening because you trigger the trace but I don't
see why the paths you indicate cause it ? If you make mthread a cpu
struct variable and add a WARN_ON(1) in the error case do you get any
useful traces ?

Alan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: PROBLEM: Race condition in tty buffer's function flush_to_ldisc().
  2011-11-07 18:39             ` Alan Cox
@ 2011-11-07 18:41               ` Alan Cox
  0 siblings, 0 replies; 11+ messages in thread
From: Alan Cox @ 2011-11-07 18:41 UTC (permalink / raw)
  To: Alan Cox; +Cc: Ilya Zykov, Alan Cox, Greg Kroah-Hartman, linux-kernel

> see why the paths you indicate cause it ? If you make mthread a cpu
> struct variable and add a WARN_ON(1) in the error case do you get any

I mean tty struct (ie tty->mthread) of course sorry

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2011-11-07 18:40 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-11-07 11:14 PROBLEM: Race condition in tty buffer's function flush_to_ldisc() Ilya Zykov
2011-11-07 11:58 ` Alan Cox
2011-11-07 12:20   ` Ilya Zykov
2011-11-07 13:06     ` Alan Cox
2011-11-07 13:44       ` Ilya Zykov
2011-11-07 14:50         ` Alan Cox
2011-11-07 14:52           ` Ilya Zykov
2011-11-07 18:26           ` Ilya Zykov
2011-11-07 18:39             ` Alan Cox
2011-11-07 18:41               ` Alan Cox
2011-11-07 12:35   ` Ilya Zykov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox