public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 3.12] Broken terminal due to echo bufferring
@ 2013-12-09  2:55 Mikulas Patocka
  2013-12-09  4:26 ` Karl Dahlke
  2013-12-09 15:43 ` Peter Hurley
  0 siblings, 2 replies; 12+ messages in thread
From: Mikulas Patocka @ 2013-12-09  2:55 UTC (permalink / raw)
  To: Peter Hurley, Greg Kroah-Hartman, Jiri Slaby; +Cc: linux-kernel

Hi

I discovered that kernel 3.12 has broken terminal handling.

I created this program to show the problem:
#include <stdio.h>
#include <unistd.h>

int main(void)
{
        int c;
        while ((c = getchar()) != EOF) {
                if (c == '\n') write(1, "prompt>", 7);
        }
        return 0;
}

Each time the user presses enter, the program prints "prompt>". Normally, 
when you press enter, you should see:

prompt>
prompt>
prompt>
prompt>_

However, with kernel 3.12.4, you occasionally see

prompt>
prompt>
prompt>prompt>
_

This bug happens randomly, it is timing-dependent. I am using single-core 
600MHz processor with preemptible kernel, the bug may or may not happen on 
other computers.

This bug is caused by Peter Hurley's echo buffering patches 
(cbfd0340ae1993378fd47179db949e050e16e697). The patches change n_tty.c so 
that it accumulates echoed characters and sends them out in a batch. 
Something like this happens:

* The user presses enter
* n_tty.c adds '\n' to the echo buffer using echo_char_raw
* n_tty.c adds '\n' to the input queue using put_tty_queue
* A process is switched
* Userspace reads '\n' from the terminal input queue
* Userspace writes the string "prompt>" to the terminal
* A process is switched back
* The echo buffer is flushed
* '\n' from the echo buffer is printed.


Echo bufferring is fundamentally wrong idea - you must make sure that you 
flush the echo buffer BEFORE you add a character to input queue and BEFORE 
you send any signal on behalf of that character. If you delay echo, you 
are breaking behavior of various programs because the program output will 
be interleaved with the echoed characters.

Here I'm sending a simple patch that disables echo buffering and restores 
correct behavior. I think you should remove the echo buffering code at 
all.

Mikulas

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com

---
 drivers/tty/n_tty.c |   15 ++-------------
 1 file changed, 2 insertions(+), 13 deletions(-)

Index: linux-3.12.4/drivers/tty/n_tty.c
===================================================================
--- linux-3.12.4.orig/drivers/tty/n_tty.c	2013-12-09 03:05:49.756728836 +0100
+++ linux-3.12.4/drivers/tty/n_tty.c	2013-12-09 03:21:10.785156100 +0100
@@ -783,21 +783,10 @@ static size_t __process_echoes(struct tt
 static void commit_echoes(struct tty_struct *tty)
 {
 	struct n_tty_data *ldata = tty->disc_data;
-	size_t nr, old, echoed;
-	size_t head;
-
-	head = ldata->echo_head;
-	old = ldata->echo_commit - ldata->echo_tail;
-
-	/* Process committed echoes if the accumulated # of bytes
-	 * is over the threshold (and try again each time another
-	 * block is accumulated) */
-	nr = head - ldata->echo_tail;
-	if (nr < ECHO_COMMIT_WATERMARK || (nr % ECHO_BLOCK > old % ECHO_BLOCK))
-		return;
+	size_t echoed;
 
 	mutex_lock(&ldata->output_lock);
-	ldata->echo_commit = head;
+	ldata->echo_commit = ldata->echo_head;
 	echoed = __process_echoes(tty);
 	mutex_unlock(&ldata->output_lock);
 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 3.12] Broken terminal due to echo bufferring
  2013-12-09  2:55 [PATCH 3.12] Broken terminal due to echo bufferring Mikulas Patocka
@ 2013-12-09  4:26 ` Karl Dahlke
  2013-12-09 15:43 ` Peter Hurley
  1 sibling, 0 replies; 12+ messages in thread
From: Karl Dahlke @ 2013-12-09  4:26 UTC (permalink / raw)
  To: mpatocka, peter, gregkh, jslaby; +Cc: linux-kernel

> I discovered that kernel 3.12 has broken terminal handling.
> I created this program to show the problem:
> ...

This is very likely the same bug I reported, with subject:

Bug in tty cooked mode kernel 3.12.3

You don't have to write a program for it, just put bash into cooked mode.
I demonstrated the bug by switching consoles and hitting return,
assuming bash has been placed in cooked mode.
It's very repeatable this way, not so random.
Switch console,hit return, the prompt is on the same line as the previous prompt.
It affects every cooked mode program, and of course your compiled program 
comes up in cooked mode.
It is cooked echo crlf that is the issue.
Programs in raw mode, like ash default in readline mode,
vi, emacs, etc,are ok.
So you don't see it right away.

See the thread with the aforementioned subject line.

I have applied two separate patches hoping to fix this problem,
but it remains.

Karl Dahlke

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3.12] Broken terminal due to echo bufferring
  2013-12-09  2:55 [PATCH 3.12] Broken terminal due to echo bufferring Mikulas Patocka
  2013-12-09  4:26 ` Karl Dahlke
@ 2013-12-09 15:43 ` Peter Hurley
  2013-12-09 22:18   ` Mikulas Patocka
  1 sibling, 1 reply; 12+ messages in thread
From: Peter Hurley @ 2013-12-09 15:43 UTC (permalink / raw)
  To: Mikulas Patocka, Greg Kroah-Hartman, Jiri Slaby; +Cc: linux-kernel

On 12/08/2013 09:55 PM, Mikulas Patocka wrote:
> Hi
>
> I discovered that kernel 3.12 has broken terminal handling.
>
> I created this program to show the problem:
> #include <stdio.h>
> #include <unistd.h>
>
> int main(void)
> {
>          int c;
>          while ((c = getchar()) != EOF) {
>                  if (c == '\n') write(1, "prompt>", 7);
>          }
>          return 0;
> }
>
> Each time the user presses enter, the program prints "prompt>". Normally,
> when you press enter, you should see:
>
> prompt>
> prompt>
> prompt>
> prompt>_
>
> However, with kernel 3.12.4, you occasionally see
>
> prompt>
> prompt>
> prompt>prompt>
> _
>
> This bug happens randomly, it is timing-dependent. I am using single-core
> 600MHz processor with preemptible kernel, the bug may or may not happen on
> other computers.
>
> This bug is caused by Peter Hurley's echo buffering patches
> (cbfd0340ae1993378fd47179db949e050e16e697). The patches change n_tty.c so
> that it accumulates echoed characters and sends them out in a batch.
> Something like this happens:
>
> * The user presses enter
> * n_tty.c adds '\n' to the echo buffer using echo_char_raw
> * n_tty.c adds '\n' to the input queue using put_tty_queue
> * A process is switched
> * Userspace reads '\n' from the terminal input queue
> * Userspace writes the string "prompt>" to the terminal
> * A process is switched back
> * The echo buffer is flushed
> * '\n' from the echo buffer is printed.
>
>
> Echo bufferring is fundamentally wrong idea - you must make sure that you
> flush the echo buffer BEFORE you add a character to input queue and BEFORE
> you send any signal on behalf of that character. If you delay echo, you
> are breaking behavior of various programs because the program output will
> be interleaved with the echoed characters.

There is nothing fundamentally broken with buffering echoes; it's just that
there is a bug wrt when to process the echoes (ie, when to force the output).

In the example you provided, the write() should cause the echoes to flush
but doesn't because the commit marker hasn't been advanced.

The commit marker wasn't advanced _yet_ because there is a race window between
the reader being woken as a result of the newline and the flush_echoes()
which happens with every received input.

Either closing the race window or advancing the commit marker prior to
write() output will fix the problem; right now, I'm looking at which is least
painful.

Regards,
Peter Hurley




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3.12] Broken terminal due to echo bufferring
  2013-12-09 15:43 ` Peter Hurley
@ 2013-12-09 22:18   ` Mikulas Patocka
  2013-12-10  0:01     ` Peter Hurley
  0 siblings, 1 reply; 12+ messages in thread
From: Mikulas Patocka @ 2013-12-09 22:18 UTC (permalink / raw)
  To: Peter Hurley; +Cc: Greg Kroah-Hartman, Jiri Slaby, linux-kernel, Karl Dahlke



On Mon, 9 Dec 2013, Peter Hurley wrote:

> On 12/08/2013 09:55 PM, Mikulas Patocka wrote:
> > Hi
> > 
> > I discovered that kernel 3.12 has broken terminal handling.
> > 
> > I created this program to show the problem:
> > #include <stdio.h>
> > #include <unistd.h>
> > 
> > int main(void)
> > {
> >          int c;
> >          while ((c = getchar()) != EOF) {
> >                  if (c == '\n') write(1, "prompt>", 7);
> >          }
> >          return 0;
> > }
> > 
> > Each time the user presses enter, the program prints "prompt>". Normally,
> > when you press enter, you should see:
> > 
> > prompt>
> > prompt>
> > prompt>
> > prompt>_
> > 
> > However, with kernel 3.12.4, you occasionally see
> > 
> > prompt>
> > prompt>
> > prompt>prompt>
> > _
> > 
> > This bug happens randomly, it is timing-dependent. I am using single-core
> > 600MHz processor with preemptible kernel, the bug may or may not happen on
> > other computers.
> > 
> > This bug is caused by Peter Hurley's echo buffering patches
> > (cbfd0340ae1993378fd47179db949e050e16e697). The patches change n_tty.c so
> > that it accumulates echoed characters and sends them out in a batch.
> > Something like this happens:
> > 
> > * The user presses enter
> > * n_tty.c adds '\n' to the echo buffer using echo_char_raw
> > * n_tty.c adds '\n' to the input queue using put_tty_queue
> > * A process is switched
> > * Userspace reads '\n' from the terminal input queue
> > * Userspace writes the string "prompt>" to the terminal
> > * A process is switched back
> > * The echo buffer is flushed
> > * '\n' from the echo buffer is printed.
> > 
> > 
> > Echo bufferring is fundamentally wrong idea - you must make sure that you
> > flush the echo buffer BEFORE you add a character to input queue and BEFORE
> > you send any signal on behalf of that character. If you delay echo, you
> > are breaking behavior of various programs because the program output will
> > be interleaved with the echoed characters.
> 
> There is nothing fundamentally broken with buffering echoes; it's just that
> there is a bug wrt when to process the echoes (ie, when to force the output).
> 
> In the example you provided, the write() should cause the echoes to flush
> but doesn't because the commit marker hasn't been advanced.
> 
> The commit marker wasn't advanced _yet_ because there is a race window between
> the reader being woken as a result of the newline and the flush_echoes()
> which happens with every received input.
> 
> Either closing the race window or advancing the commit marker prior to
> write() output will fix the problem; right now, I'm looking at which is least
> painful.
> 
> Regards,
> Peter Hurley

I still think you should drop this.


The user types on the keyboard so slowly, that lock contention doesn't 
matter. Specialized programs that use terminal to transmit bulk data don't 
use cooked mode and echo. So I don't really see any use case where 
something depends on performance of echoed characters.


Those patches just complicate the code for no benefit.


When you read a variable that may be asynchronously modified, you need 
ACCESS_ONCE - for example you need this in process_echoes when accessing 
the variables outside the lock:
if (ACCESS_ONCE(ldata->echo_commit) == ACCESS_ONCE(ldata->echo_tail))

Anyway accessing variables that may change without locks or barriers is 
generally bad idea and it is hard to verify it. Terminal layer is not 
performance-sensitive part of the kernel, so it isn't justified to use 
such dirty tricks.


Another problem: what about this in process_echoes and flush_echoes?
if (!L_ECHO(tty) || ldata->echo_commit == ldata->echo_tail)
	return;
- so if the user turns off echo, echoes are not performed. But the buffer 
is not flushed. So when the user turns on echo again, previously buffered 
characters will be echoed. That is wrong.


Mikulas

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3.12] Broken terminal due to echo bufferring
  2013-12-09 22:18   ` Mikulas Patocka
@ 2013-12-10  0:01     ` Peter Hurley
  2013-12-10  2:29       ` Mikulas Patocka
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Hurley @ 2013-12-10  0:01 UTC (permalink / raw)
  To: Mikulas Patocka; +Cc: Greg Kroah-Hartman, Jiri Slaby, linux-kernel, Karl Dahlke

On 12/09/2013 05:18 PM, Mikulas Patocka wrote:
>
>
> On Mon, 9 Dec 2013, Peter Hurley wrote:
>
>> On 12/08/2013 09:55 PM, Mikulas Patocka wrote:
>>> Hi
>>>
>>> I discovered that kernel 3.12 has broken terminal handling.
>>>
>>> I created this program to show the problem:
>>> #include <stdio.h>
>>> #include <unistd.h>
>>>
>>> int main(void)
>>> {
>>>           int c;
>>>           while ((c = getchar()) != EOF) {
>>>                   if (c == '\n') write(1, "prompt>", 7);
>>>           }
>>>           return 0;
>>> }
>>>
>>> Each time the user presses enter, the program prints "prompt>". Normally,
>>> when you press enter, you should see:
>>>
>>> prompt>
>>> prompt>
>>> prompt>
>>> prompt>_
>>>
>>> However, with kernel 3.12.4, you occasionally see
>>>
>>> prompt>
>>> prompt>
>>> prompt>prompt>
>>> _
>>>
>>> This bug happens randomly, it is timing-dependent. I am using single-core
>>> 600MHz processor with preemptible kernel, the bug may or may not happen on
>>> other computers.
>>>
>>> This bug is caused by Peter Hurley's echo buffering patches
>>> (cbfd0340ae1993378fd47179db949e050e16e697). The patches change n_tty.c so
>>> that it accumulates echoed characters and sends them out in a batch.
>>> Something like this happens:
>>>
>>> * The user presses enter
>>> * n_tty.c adds '\n' to the echo buffer using echo_char_raw
>>> * n_tty.c adds '\n' to the input queue using put_tty_queue
>>> * A process is switched
>>> * Userspace reads '\n' from the terminal input queue
>>> * Userspace writes the string "prompt>" to the terminal
>>> * A process is switched back
>>> * The echo buffer is flushed
>>> * '\n' from the echo buffer is printed.
>>>
>>>
>>> Echo bufferring is fundamentally wrong idea - you must make sure that you
>>> flush the echo buffer BEFORE you add a character to input queue and BEFORE
>>> you send any signal on behalf of that character. If you delay echo, you
>>> are breaking behavior of various programs because the program output will
>>> be interleaved with the echoed characters.
>>
>> There is nothing fundamentally broken with buffering echoes; it's just that
>> there is a bug wrt when to process the echoes (ie, when to force the output).
>>
>> In the example you provided, the write() should cause the echoes to flush
>> but doesn't because the commit marker hasn't been advanced.
>>
>> The commit marker wasn't advanced _yet_ because there is a race window between
>> the reader being woken as a result of the newline and the flush_echoes()
>> which happens with every received input.
>>
>> Either closing the race window or advancing the commit marker prior to
>> write() output will fix the problem; right now, I'm looking at which is least
>> painful.
>>
>> Regards,
>> Peter Hurley
>
> I still think you should drop this.
>
>
> The user types on the keyboard so slowly, that lock contention doesn't
> matter. Specialized programs that use terminal to transmit bulk data don't
> use cooked mode and echo. So I don't really see any use case where
> something depends on performance of echoed characters.
>
>
> Those patches just complicate the code for no benefit.
>
>
> When you read a variable that may be asynchronously modified, you need
> ACCESS_ONCE - for example you need this in process_echoes when accessing
> the variables outside the lock:
> if (ACCESS_ONCE(ldata->echo_commit) == ACCESS_ONCE(ldata->echo_tail))

Not necessarily. Stale values in an SMP environment may not be a problem;
in this case, a possibly stale echo_tail simply means that the output_lock
will be obtained unnecessarily (which is cheaper every-so-often than contending
over the echo_tail cache line every write, especially on x86 where there is
no problem).

Similarly, so many fences had to be passed to get to the echo_commit load
from userspace that performing a load-acquire here and store-release in
commit_echoes would be ridiculously superfluous.

> Anyway accessing variables that may change without locks or barriers is
> generally bad idea and it is hard to verify it. Terminal layer is not
> performance-sensitive part of the kernel, so it isn't justified to use
> such dirty tricks.
>
>
> Another problem: what about this in process_echoes and flush_echoes?
> if (!L_ECHO(tty) || ldata->echo_commit == ldata->echo_tail)
> 	return;
> - so if the user turns off echo, echoes are not performed. But the buffer
> is not flushed. So when the user turns on echo again, previously buffered
> characters will be echoed. That is wrong.

The check for !L_ECHO pre-dates my patches; it might be wrong but userspace
may have come to rely on this behavior. That said, feel free to submit a fix
for that, if you think it's broken.

Regards,
Peter Hurley


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3.12] Broken terminal due to echo bufferring
  2013-12-10  0:01     ` Peter Hurley
@ 2013-12-10  2:29       ` Mikulas Patocka
  2013-12-10 10:48         ` Karl Dahlke
                           ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Mikulas Patocka @ 2013-12-10  2:29 UTC (permalink / raw)
  To: Peter Hurley; +Cc: Greg Kroah-Hartman, Jiri Slaby, linux-kernel, Karl Dahlke



On Mon, 9 Dec 2013, Peter Hurley wrote:

> On 12/09/2013 05:18 PM, Mikulas Patocka wrote:
> > 
> > 
> > On Mon, 9 Dec 2013, Peter Hurley wrote:
> > 
> > > On 12/08/2013 09:55 PM, Mikulas Patocka wrote:
> > > > Hi
> > > > 
> > > > I discovered that kernel 3.12 has broken terminal handling.
> > > > 
> > > > I created this program to show the problem:
> > > > #include <stdio.h>
> > > > #include <unistd.h>
> > > > 
> > > > int main(void)
> > > > {
> > > >           int c;
> > > >           while ((c = getchar()) != EOF) {
> > > >                   if (c == '\n') write(1, "prompt>", 7);
> > > >           }
> > > >           return 0;
> > > > }
> > > > 
> > > > Each time the user presses enter, the program prints "prompt>".
> > > > Normally,
> > > > when you press enter, you should see:
> > > > 
> > > > prompt>
> > > > prompt>
> > > > prompt>
> > > > prompt>_
> > > > 
> > > > However, with kernel 3.12.4, you occasionally see
> > > > 
> > > > prompt>
> > > > prompt>
> > > > prompt>prompt>
> > > > _
> > > > 
> > > > This bug happens randomly, it is timing-dependent. I am using
> > > > single-core
> > > > 600MHz processor with preemptible kernel, the bug may or may not happen
> > > > on
> > > > other computers.
> > > > 
> > > > This bug is caused by Peter Hurley's echo buffering patches
> > > > (cbfd0340ae1993378fd47179db949e050e16e697). The patches change n_tty.c
> > > > so
> > > > that it accumulates echoed characters and sends them out in a batch.
> > > > Something like this happens:
> > > > 
> > > > * The user presses enter
> > > > * n_tty.c adds '\n' to the echo buffer using echo_char_raw
> > > > * n_tty.c adds '\n' to the input queue using put_tty_queue
> > > > * A process is switched
> > > > * Userspace reads '\n' from the terminal input queue
> > > > * Userspace writes the string "prompt>" to the terminal
> > > > * A process is switched back
> > > > * The echo buffer is flushed
> > > > * '\n' from the echo buffer is printed.
> > > > 
> > > > 
> > > > Echo bufferring is fundamentally wrong idea - you must make sure that
> > > > you
> > > > flush the echo buffer BEFORE you add a character to input queue and
> > > > BEFORE
> > > > you send any signal on behalf of that character. If you delay echo, you
> > > > are breaking behavior of various programs because the program output
> > > > will
> > > > be interleaved with the echoed characters.
> > > 
> > > There is nothing fundamentally broken with buffering echoes; it's just
> > > that
> > > there is a bug wrt when to process the echoes (ie, when to force the
> > > output).
> > > 
> > > In the example you provided, the write() should cause the echoes to flush
> > > but doesn't because the commit marker hasn't been advanced.
> > > 
> > > The commit marker wasn't advanced _yet_ because there is a race window
> > > between
> > > the reader being woken as a result of the newline and the flush_echoes()
> > > which happens with every received input.
> > > 
> > > Either closing the race window or advancing the commit marker prior to
> > > write() output will fix the problem; right now, I'm looking at which is
> > > least
> > > painful.
> > > 
> > > Regards,
> > > Peter Hurley
> > 
> > I still think you should drop this.
> > 
> > 
> > The user types on the keyboard so slowly, that lock contention doesn't
> > matter. Specialized programs that use terminal to transmit bulk data don't
> > use cooked mode and echo. So I don't really see any use case where
> > something depends on performance of echoed characters.
> > 
> > 
> > Those patches just complicate the code for no benefit.
> > 
> > 
> > When you read a variable that may be asynchronously modified, you need
> > ACCESS_ONCE - for example you need this in process_echoes when accessing
> > the variables outside the lock:
> > if (ACCESS_ONCE(ldata->echo_commit) == ACCESS_ONCE(ldata->echo_tail))
> 
> Not necessarily. Stale values in an SMP environment may not be a problem;
> in this case, a possibly stale echo_tail simply means that the output_lock
> will be obtained unnecessarily (which is cheaper every-so-often than
> contending
> over the echo_tail cache line every write, especially on x86 where there is
> no problem).

Note that a single lock doesn't imply memory barrier:
	read(variable_1);
	spin_lock(lock);
	spin_unlock(lock);
	read(variable_2);
may be reordered to
	spin_lock(lock);
	read(variable_2);
	read(variable_1);
	spin_unlock(lock);

Two lock do imply a memory barrier. Surely, you can argue that the system 
takes at least two locks between reading the input queue and writing to 
the output to compensate for the missing memory barrier. But depending on 
such guarantees is dirty.

What happens if I write the equivalent of the above program that reads 
'\n' and writes "prompt>" in the kernel space? Will there still be two 
locks between those operations? Will there be two locks always in the 
future?


Also note, that you need ACCESS_ONCE if the variable may change. The 
compiler may assume during optimizations that the variables that are read 
don't change.

The compiler may even generate something like this when you read variable 
"v":
	movl	v, %eax
	cmpl	v, %eax
	jne	nowhere
- of course it doesn't actually generate this code, but legally it could.
ACCESS_ONCE is there to prevent this assumption.


I suggest that you change commit_echoes to always write the buffer and 
flush it with tty->ops->flush_chars(tty). And then, you can drop 
process_echoes from n_tty_write. And then, there will be no asynchronous 
access to the buffer pointers.

> Similarly, so many fences had to be passed to get to the echo_commit load
> from userspace that performing a load-acquire here and store-release in
> commit_echoes would be ridiculously superfluous.
> 
> > Anyway accessing variables that may change without locks or barriers is
> > generally bad idea and it is hard to verify it. Terminal layer is not
> > performance-sensitive part of the kernel, so it isn't justified to use
> > such dirty tricks.
> > 
> > 
> > Another problem: what about this in process_echoes and flush_echoes?
> > if (!L_ECHO(tty) || ldata->echo_commit == ldata->echo_tail)
> > 	return;
> > - so if the user turns off echo, echoes are not performed. But the buffer
> > is not flushed. So when the user turns on echo again, previously buffered
> > characters will be echoed. That is wrong.
> 
> The check for !L_ECHO pre-dates my patches; it might be wrong but userspace
> may have come to rely on this behavior. That said, feel free to submit a fix
> for that, if you think it's broken.

We should just clear the buffer on !L_ECHO. Or maybe (once we get rid of 
the asynchronous buffer access) do not test here for L_ECHO at all - if 
L_ECHO isn't set, then nothing is appended to the buffer. Consequently we 
don't have to check for L_ECHO when we are flushing the buffer.

> Regards,
> Peter Hurley

Mikulas

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 3.12] Broken terminal due to echo bufferring
  2013-12-10  2:29       ` Mikulas Patocka
@ 2013-12-10 10:48         ` Karl Dahlke
  2013-12-10 11:00           ` Peter Hurley
  2013-12-10 10:52         ` Karl Dahlke
  2013-12-10 14:34         ` Peter Hurley
  2 siblings, 1 reply; 12+ messages in thread
From: Karl Dahlke @ 2013-12-10 10:48 UTC (permalink / raw)
  To: mpatocka, peter; +Cc: gregkh, jslaby, linux-kernel

| An involved discussion about race conditions and asynchronous events,
| which is beyond me.
| Please continue; I'm sure you will figure it out,
| and perhaps educate me along the way.
| But this thread began with the following program that revealed,
| I believe, the same echo crlf bug that I pointed out.

> #include <stdio.h>
> #include <unistd.h>
> 
> int main(void)
> {
> int c;
> while ((c = getchar()) != EOF) {
> if (c == '\n') write(1, "prompt>", 7);
> }
> return 0;
> }

Peter sent me a patch which fixed my bug, in its console switch form.
And also seemed to fix the bug whenever I was in a cooked mode program.
So I was happy.
For grins I compiled this program, to see if it also
ran properly.
It does, while it is running.
Prompt and newline come out in sequence, in order,
and everything looks right.
Hit control d, for EOF, as the program expects, and all is well.
But hit ^c for interrupt, and as Tilly says,
"All hell done broke loose now."
The tty spills out a bunch of accumulated text that it has displayed in the past.
Don't know where it comes from, or why.
I ran the program several times, interrupt,
and the same thing happened each time, even the same stored output,
as though it were a 4k block that was retained from somewhere.
Try it and see (with Peter's latest patches).

I may try some variants: this program running in raw mode,
using read and write so we don't have half of it using stdio,
other standard cooked mode programs that are ^c interruptable.

I guess the tty is an incredibly complicated beast.

Karl Dahlke

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 3.12] Broken terminal due to echo bufferring
  2013-12-10  2:29       ` Mikulas Patocka
  2013-12-10 10:48         ` Karl Dahlke
@ 2013-12-10 10:52         ` Karl Dahlke
  2013-12-10 14:34         ` Peter Hurley
  2 siblings, 0 replies; 12+ messages in thread
From: Karl Dahlke @ 2013-12-10 10:52 UTC (permalink / raw)
  To: mpatocka, peter; +Cc: gregkh, jslaby, linux-kernel

Just a followup to my last post.
You don't need to compile the write("prompt>") program to see the bug.
Just edit a modest text file with /bin/ed, which runs in cooked mode.
As soon as the file comes up, hit ^c.
Lots of old stuff comes spewing out.
Eventually the tty is back and you can edit the file as usual, or quit.

Karl Dahlke

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3.12] Broken terminal due to echo bufferring
  2013-12-10 10:48         ` Karl Dahlke
@ 2013-12-10 11:00           ` Peter Hurley
  0 siblings, 0 replies; 12+ messages in thread
From: Peter Hurley @ 2013-12-10 11:00 UTC (permalink / raw)
  To: Karl Dahlke, mpatocka; +Cc: gregkh, jslaby, linux-kernel

On 12/10/2013 05:48 AM, Karl Dahlke wrote:
> | An involved discussion about race conditions and asynchronous events,
> | which is beyond me.
> | Please continue; I'm sure you will figure it out,
> | and perhaps educate me along the way.
> | But this thread began with the following program that revealed,
> | I believe, the same echo crlf bug that I pointed out.
>
>> #include <stdio.h>
>> #include <unistd.h>
>>
>> int main(void)
>> {
>> int c;
>> while ((c = getchar()) != EOF) {
>> if (c == '\n') write(1, "prompt>", 7);
>> }
>> return 0;
>> }
>
> Peter sent me a patch which fixed my bug, in its console switch form.
> And also seemed to fix the bug whenever I was in a cooked mode program.
> So I was happy.
> For grins I compiled this program, to see if it also
> ran properly.
> It does, while it is running.
> Prompt and newline come out in sequence, in order,
> and everything looks right.
> Hit control d, for EOF, as the program expects, and all is well.
> But hit ^c for interrupt, and as Tilly says,
> "All hell done broke loose now."
> The tty spills out a bunch of accumulated text that it has displayed in the past.
> Don't know where it comes from, or why.
> I ran the program several times, interrupt,
> and the same thing happened each time, even the same stored output,
> as though it were a 4k block that was retained from somewhere.
> Try it and see (with Peter's latest patches).

Hi Karl,

That's happening because you're also running the 'tty: Fix ^C echo' patch
series which pre-dates this fix and is _not_ compatible with it.

I've already asked Greg to drop that series for other reasons (which he
had planned on anyway).

Regards,
Peter Hurley


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3.12] Broken terminal due to echo bufferring
  2013-12-10  2:29       ` Mikulas Patocka
  2013-12-10 10:48         ` Karl Dahlke
  2013-12-10 10:52         ` Karl Dahlke
@ 2013-12-10 14:34         ` Peter Hurley
  2013-12-11 16:29           ` Mikulas Patocka
  2 siblings, 1 reply; 12+ messages in thread
From: Peter Hurley @ 2013-12-10 14:34 UTC (permalink / raw)
  To: Mikulas Patocka; +Cc: Greg Kroah-Hartman, Jiri Slaby, linux-kernel, Karl Dahlke

On 12/09/2013 09:29 PM, Mikulas Patocka wrote:
>
>
> On Mon, 9 Dec 2013, Peter Hurley wrote:
>
>> On 12/09/2013 05:18 PM, Mikulas Patocka wrote:
>>>
>>>
>>> On Mon, 9 Dec 2013, Peter Hurley wrote:
>>>
>>>> On 12/08/2013 09:55 PM, Mikulas Patocka wrote:
>>>>> Hi
>>>>>
>>>>> I discovered that kernel 3.12 has broken terminal handling.
>>>>>
>>>>> I created this program to show the problem:
>>>>> #include <stdio.h>
>>>>> #include <unistd.h>
>>>>>
>>>>> int main(void)
>>>>> {
>>>>>            int c;
>>>>>            while ((c = getchar()) != EOF) {
>>>>>                    if (c == '\n') write(1, "prompt>", 7);
>>>>>            }
>>>>>            return 0;
>>>>> }
>>>>>
>>>>> Each time the user presses enter, the program prints "prompt>".
>>>>> Normally,
>>>>> when you press enter, you should see:
>>>>>
>>>>> prompt>
>>>>> prompt>
>>>>> prompt>
>>>>> prompt>_
>>>>>
>>>>> However, with kernel 3.12.4, you occasionally see
>>>>>
>>>>> prompt>
>>>>> prompt>
>>>>> prompt>prompt>
>>>>> _
>>>>>
>>>>> This bug happens randomly, it is timing-dependent. I am using
>>>>> single-core
>>>>> 600MHz processor with preemptible kernel, the bug may or may not happen
>>>>> on
>>>>> other computers.
>>>>>
>>>>> This bug is caused by Peter Hurley's echo buffering patches
>>>>> (cbfd0340ae1993378fd47179db949e050e16e697). The patches change n_tty.c
>>>>> so
>>>>> that it accumulates echoed characters and sends them out in a batch.
>>>>> Something like this happens:
>>>>>
>>>>> * The user presses enter
>>>>> * n_tty.c adds '\n' to the echo buffer using echo_char_raw
>>>>> * n_tty.c adds '\n' to the input queue using put_tty_queue
>>>>> * A process is switched
>>>>> * Userspace reads '\n' from the terminal input queue
>>>>> * Userspace writes the string "prompt>" to the terminal
>>>>> * A process is switched back
>>>>> * The echo buffer is flushed
>>>>> * '\n' from the echo buffer is printed.
>>>>>
>>>>>
>>>>> Echo bufferring is fundamentally wrong idea - you must make sure that
>>>>> you
>>>>> flush the echo buffer BEFORE you add a character to input queue and
>>>>> BEFORE
>>>>> you send any signal on behalf of that character. If you delay echo, you
>>>>> are breaking behavior of various programs because the program output
>>>>> will
>>>>> be interleaved with the echoed characters.
>>>>
>>>> There is nothing fundamentally broken with buffering echoes; it's just
>>>> that
>>>> there is a bug wrt when to process the echoes (ie, when to force the
>>>> output).
>>>>
>>>> In the example you provided, the write() should cause the echoes to flush
>>>> but doesn't because the commit marker hasn't been advanced.
>>>>
>>>> The commit marker wasn't advanced _yet_ because there is a race window
>>>> between
>>>> the reader being woken as a result of the newline and the flush_echoes()
>>>> which happens with every received input.
>>>>
>>>> Either closing the race window or advancing the commit marker prior to
>>>> write() output will fix the problem; right now, I'm looking at which is
>>>> least
>>>> painful.
>>>>
>>>> Regards,
>>>> Peter Hurley
>>>
>>> I still think you should drop this.
>>>
>>>
>>> The user types on the keyboard so slowly, that lock contention doesn't
>>> matter. Specialized programs that use terminal to transmit bulk data don't
>>> use cooked mode and echo. So I don't really see any use case where
>>> something depends on performance of echoed characters.
>>>
>>>
>>> Those patches just complicate the code for no benefit.
>>>
>>>
>>> When you read a variable that may be asynchronously modified, you need
>>> ACCESS_ONCE - for example you need this in process_echoes when accessing
>>> the variables outside the lock:
>>> if (ACCESS_ONCE(ldata->echo_commit) == ACCESS_ONCE(ldata->echo_tail))
>>
>> Not necessarily. Stale values in an SMP environment may not be a problem;
>> in this case, a possibly stale echo_tail simply means that the output_lock
>> will be obtained unnecessarily (which is cheaper every-so-often than
>> contending
>> over the echo_tail cache line every write, especially on x86 where there is
>> no problem).
>
> Note that a single lock doesn't imply memory barrier:
> 	read(variable_1);
> 	spin_lock(lock);
> 	spin_unlock(lock);
> 	read(variable_2);
> may be reordered to
> 	spin_lock(lock);
> 	read(variable_2);
> 	read(variable_1);
> 	spin_unlock(lock);
>
> Two lock do imply a memory barrier. Surely, you can argue that the system
> takes at least two locks between reading the input queue and writing to
> the output to compensate for the missing memory barrier. But depending on
> such guarantees is dirty.

To escape from n_tty_read() alone requires passing through (at a minimum)
1. UNLOCK rwsem
2. LOCK wq
3. UNLOCK wq
4. UNLOCK read serialization

> What happens if I write the equivalent of the above program that reads
> '\n' and writes "prompt>" in the kernel space? Will there still be two
> locks between those operations? Will there be two locks always in the
> future?

Out-of-tree breakage is common. Documentation/stable_api_nonsense.txt
explains why.

> Also note, that you need ACCESS_ONCE if the variable may change. The
> compiler may assume during optimizations that the variables that are read
> don't change.

Lots of (many? most?) kernel variables change asynchronously and are still
read without ACCESS_ONCE();  consider how waitqueue_active() works.

> The compiler may even generate something like this when you read variable
> "v":
> 	movl	v, %eax
> 	cmpl	v, %eax
> 	jne	nowhere
> - of course it doesn't actually generate this code, but legally it could.
> ACCESS_ONCE is there to prevent this assumption.

Not in this case. The compiler could not possibly prove the loads
are unnecessary: n_tty_write() is a globally visible call target.
(Besides, the load can't be optimized out because of the LOCK rwsem.)

> I suggest that you change commit_echoes to always write the buffer and
> flush it with tty->ops->flush_chars(tty). And then, you can drop
> process_echoes from n_tty_write. And then, there will be no asynchronous
> access to the buffer pointers.

process_echoes() cannot be dropped from n_tty_write() because, even without
block commits, the driver's output buffer may be full and unable to accept
more input. That's why the echo buffer exists.

>> Similarly, so many fences had to be passed to get to the echo_commit load
>> from userspace that performing a load-acquire here and store-release in
>> commit_echoes would be ridiculously superfluous.
>>
>>> Anyway accessing variables that may change without locks or barriers is
>>> generally bad idea and it is hard to verify it. Terminal layer is not
>>> performance-sensitive part of the kernel, so it isn't justified to use
>>> such dirty tricks.
>>>
>>>
>>> Another problem: what about this in process_echoes and flush_echoes?
>>> if (!L_ECHO(tty) || ldata->echo_commit == ldata->echo_tail)
>>> 	return;
>>> - so if the user turns off echo, echoes are not performed. But the buffer
>>> is not flushed. So when the user turns on echo again, previously buffered
>>> characters will be echoed. That is wrong.
>>
>> The check for !L_ECHO pre-dates my patches; it might be wrong but userspace
>> may have come to rely on this behavior. That said, feel free to submit a fix
>> for that, if you think it's broken.
>
> We should just clear the buffer on !L_ECHO. Or maybe (once we get rid of
> the asynchronous buffer access) do not test here for L_ECHO at all - if
> L_ECHO isn't set, then nothing is appended to the buffer. Consequently we
> don't have to check for L_ECHO when we are flushing the buffer.

Discarding the echo buffer with L_ECHO -> !L_ECHO changes will almost certainly
break something. I considered attempting to push the echo buffer when that change
happens in n_tty_set_termios() but simply haven't gotten to it for testing; and that
still wouldn't get rid of the need to check if echoes need to be pushed when !L_ECHO.

Regards,
Peter Hurley


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3.12] Broken terminal due to echo bufferring
  2013-12-10 14:34         ` Peter Hurley
@ 2013-12-11 16:29           ` Mikulas Patocka
  2013-12-13 21:24             ` Peter Hurley
  0 siblings, 1 reply; 12+ messages in thread
From: Mikulas Patocka @ 2013-12-11 16:29 UTC (permalink / raw)
  To: Peter Hurley; +Cc: Greg Kroah-Hartman, Jiri Slaby, linux-kernel, Karl Dahlke



On Tue, 10 Dec 2013, Peter Hurley wrote:

> On 12/09/2013 09:29 PM, Mikulas Patocka wrote:
> > 
> > 
> > On Mon, 9 Dec 2013, Peter Hurley wrote:
> > 
> > > On 12/09/2013 05:18 PM, Mikulas Patocka wrote:
> > > > 
> > > > I still think you should drop this.
> > > > 
> > > > 
> > > > The user types on the keyboard so slowly, that lock contention doesn't
> > > > matter. Specialized programs that use terminal to transmit bulk data
> > > > don't
> > > > use cooked mode and echo. So I don't really see any use case where
> > > > something depends on performance of echoed characters.
> > > > 
> > > > 
> > > > Those patches just complicate the code for no benefit.
> > > > 
> > > > 
> > > > When you read a variable that may be asynchronously modified, you need
> > > > ACCESS_ONCE - for example you need this in process_echoes when accessing
> > > > the variables outside the lock:
> > > > if (ACCESS_ONCE(ldata->echo_commit) == ACCESS_ONCE(ldata->echo_tail))
> > > 
> > > Not necessarily. Stale values in an SMP environment may not be a problem;
> > > in this case, a possibly stale echo_tail simply means that the output_lock
> > > will be obtained unnecessarily (which is cheaper every-so-often than
> > > contending
> > > over the echo_tail cache line every write, especially on x86 where there
> > > is
> > > no problem).
> > 
> > Note that a single lock doesn't imply memory barrier:
> > 	read(variable_1);
> > 	spin_lock(lock);
> > 	spin_unlock(lock);
> > 	read(variable_2);
> > may be reordered to
> > 	spin_lock(lock);
> > 	read(variable_2);
> > 	read(variable_1);
> > 	spin_unlock(lock);
> > 
> > Two lock do imply a memory barrier. Surely, you can argue that the system
> > takes at least two locks between reading the input queue and writing to
> > the output to compensate for the missing memory barrier. But depending on
> > such guarantees is dirty.
> 
> To escape from n_tty_read() alone requires passing through (at a minimum)
> 1. UNLOCK rwsem
> 2. LOCK wq
> 3. UNLOCK wq
> 4. UNLOCK read serialization

Sure, but you should just add the barriers if they are needed, don't rely 
on others doing barriers for you. Besides, there is another rw-semaphore 
implementation that doesn't have any barrier guarantees 
(include/linux/percpu-rwsem.h).

> > Also note, that you need ACCESS_ONCE if the variable may change. The
> > compiler may assume during optimizations that the variables that are read
> > don't change.
> 
> Lots of (many? most?) kernel variables change asynchronously and are still
> read without ACCESS_ONCE();  consider how waitqueue_active() works.

So they should be read with ACCESS_ONCE().

> > The compiler may even generate something like this when you read variable
> > "v":
> > 	movl	v, %eax
> > 	cmpl	v, %eax
> > 	jne	nowhere
> > - of course it doesn't actually generate this code, but legally it could.
> > ACCESS_ONCE is there to prevent this assumption.
> 
> Not in this case. The compiler could not possibly prove the loads
> are unnecessary: n_tty_write() is a globally visible call target.
> (Besides, the load can't be optimized out because of the LOCK rwsem.)

The compiler may assume that the variables that it is reading don't 
change. Usually it doesn't use this assumption (that why omitting 
ACCESS_ONCE in most cases doesn't result in any observable wrongdoing), 
but nonetheless, the compiler may use this assumption.

For example, if the compiler proves that action1() and action2() don't 
change the variable, it may transform this piece of code:
	if (variable) {
		action1();
		action2();
		action3();
	} else {
		action2();
	}
into this piece of code:
	if (variable)
		action1();
	action2();
	if (variable)
		action3();

- obviously, if the variable changes asynchronously, it results in 
unintended bahavior.

> > I suggest that you change commit_echoes to always write the buffer and
> > flush it with tty->ops->flush_chars(tty). And then, you can drop
> > process_echoes from n_tty_write. And then, there will be no asynchronous
> > access to the buffer pointers.
> 
> process_echoes() cannot be dropped from n_tty_write() because, even without
> block commits, the driver's output buffer may be full and unable to accept
> more input. That's why the echo buffer exists.

Let me ask you this question:

Does the function __process_echoes (in 3.12) or process_echoes (in 3.11) 
guarantee that the echo buffer is emptied and all characters in the buffer 
are sent to the terminal?

- if it has this guarantee, then you don't need to call that function in 
n_tty_write. It is just enough to call it before adding the character to 
the input queue.

- if it doesn't have this guarantee, then then the code is already buggy:
suppose for example this race condition:
1. the user presses enter
2. '\n' is added to the echo buffer
3. the echo buffer is not flushed because tty_write_room returns zero
4. the program reads '\n' from the input buffer
5. the program writes the string "prompt>" to the terminal
6. n_tty_write calls process_echoes, it still doesn't echo anything 
   because tty_write_room returns zero
7. the terminal driver makes some progress and makes some room in its 
   buffer so that tty_write_room no longer returns zero
8. n_tty_write writes the string "prompt>" while '\n' is still sitting in 
   the echo buffer

Another problem - if __process_echoes doesn't flush the echo buffer, who 
does flush it afterwards? You need to spawn a workqueue that waits on 
tty->write_wait and flushes the echo buffer when the terminal drivers 
makes some room.

> > > > Another problem: what about this in process_echoes and flush_echoes?
> > > > if (!L_ECHO(tty) || ldata->echo_commit == ldata->echo_tail)
> > > > 	return;
> > > > - so if the user turns off echo, echoes are not performed. But the
> > > > buffer
> > > > is not flushed. So when the user turns on echo again, previously
> > > > buffered
> > > > characters will be echoed. That is wrong.
> > > 
> > > The check for !L_ECHO pre-dates my patches; it might be wrong but
> > > userspace
> > > may have come to rely on this behavior. That said, feel free to submit a
> > > fix
> > > for that, if you think it's broken.
> > 
> > We should just clear the buffer on !L_ECHO. Or maybe (once we get rid of
> > the asynchronous buffer access) do not test here for L_ECHO at all - if
> > L_ECHO isn't set, then nothing is appended to the buffer. Consequently we
> > don't have to check for L_ECHO when we are flushing the buffer.
> 
> Discarding the echo buffer with L_ECHO -> !L_ECHO changes will almost 
> certainly break something. I considered attempting to push the echo 
> buffer when that change happens in n_tty_set_termios() but simply 
> haven't gotten to it for testing; and that still wouldn't get rid of the 
> need to check if echoes need to be pushed when !L_ECHO.

If there is !L_ECHO, than no characters are added to the echo buffer. 
Then, you test L_ECHO again, when flushing the echo buffer.

So I think there's a race condition:
1. L_ECHO is on
2. the user types some characters, they are added to the echo buffer
3. the program turns L_ECHO off
4. process_echoes and flush_echoes see that L_ECHO is off, so they don't 
   flush the buffer - but the buffer still contains the characters
5. some time passes
6. the program turns L_ECHO on
7. the characters typed in step 2. are still in the buffer and they are 
   echoed now - this is WRONG - the characters typed in step 2. should be
   either echoed immediatelly or not echoed at all

The kernel 3.11 doesn't have this bug (it doesn't check for L_ECHO in 
process_echoes).

> Regards,
> Peter Hurley

Mikulas

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3.12] Broken terminal due to echo bufferring
  2013-12-11 16:29           ` Mikulas Patocka
@ 2013-12-13 21:24             ` Peter Hurley
  0 siblings, 0 replies; 12+ messages in thread
From: Peter Hurley @ 2013-12-13 21:24 UTC (permalink / raw)
  To: Mikulas Patocka; +Cc: Greg Kroah-Hartman, Jiri Slaby, linux-kernel, Karl Dahlke

On 12/11/2013 11:29 AM, Mikulas Patocka wrote:
>
>
> On Tue, 10 Dec 2013, Peter Hurley wrote:
>
>> On 12/09/2013 09:29 PM, Mikulas Patocka wrote:
>>>
>>>
>>> On Mon, 9 Dec 2013, Peter Hurley wrote:
>>>
>>>> On 12/09/2013 05:18 PM, Mikulas Patocka wrote:
>>>>>
>>>>> I still think you should drop this.
>>>>>
>>>>>
>>>>> The user types on the keyboard so slowly, that lock contention doesn't
>>>>> matter. Specialized programs that use terminal to transmit bulk data
>>>>> don't
>>>>> use cooked mode and echo. So I don't really see any use case where
>>>>> something depends on performance of echoed characters.
>>>>>
>>>>>
>>>>> Those patches just complicate the code for no benefit.
>>>>>
>>>>>
>>>>> When you read a variable that may be asynchronously modified, you need
>>>>> ACCESS_ONCE - for example you need this in process_echoes when accessing
>>>>> the variables outside the lock:
>>>>> if (ACCESS_ONCE(ldata->echo_commit) == ACCESS_ONCE(ldata->echo_tail))
>>>>
>>>> Not necessarily. Stale values in an SMP environment may not be a problem;
>>>> in this case, a possibly stale echo_tail simply means that the output_lock
>>>> will be obtained unnecessarily (which is cheaper every-so-often than
>>>> contending
>>>> over the echo_tail cache line every write, especially on x86 where there
>>>> is
>>>> no problem).
>>>
>>> Note that a single lock doesn't imply memory barrier:
>>> 	read(variable_1);
>>> 	spin_lock(lock);
>>> 	spin_unlock(lock);
>>> 	read(variable_2);
>>> may be reordered to
>>> 	spin_lock(lock);
>>> 	read(variable_2);
>>> 	read(variable_1);
>>> 	spin_unlock(lock);
>>>
>>> Two lock do imply a memory barrier. Surely, you can argue that the system
>>> takes at least two locks between reading the input queue and writing to
>>> the output to compensate for the missing memory barrier. But depending on
>>> such guarantees is dirty.
>>
>> To escape from n_tty_read() alone requires passing through (at a minimum)
>> 1. UNLOCK rwsem
>> 2. LOCK wq
>> 3. UNLOCK wq
>> 4. UNLOCK read serialization
>
> Sure, but you should just add the barriers if they are needed, don't rely
> on others doing barriers for you.

Sorry, but I'm not add extra memory barriers when adequate barriers already
exist, as designed and explained.

> Besides, there is another rw-semaphore
> implementation that doesn't have any barrier guarantees
> (include/linux/percpu-rwsem.h).

Which is why percpu-rwsem is not a drop-in replacement for regular rwsem.

>>> Also note, that you need ACCESS_ONCE if the variable may change. The
>>> compiler may assume during optimizations that the variables that are read
>>> don't change.
>>
>> Lots of (many? most?) kernel variables change asynchronously and are still
>> read without ACCESS_ONCE();  consider how waitqueue_active() works.
>
> So they should be read with ACCESS_ONCE().

You'll also want to 'fix' most of the existing locks, like rwsem which
peeks at the lock count outside any barriers (or any of the other highly-
contended paths in other locks as well).

>>> The compiler may even generate something like this when you read variable
>>> "v":
>>> 	movl	v, %eax
>>> 	cmpl	v, %eax
>>> 	jne	nowhere
>>> - of course it doesn't actually generate this code, but legally it could.
>>> ACCESS_ONCE is there to prevent this assumption.
>>
>> Not in this case. The compiler could not possibly prove the loads
>> are unnecessary: n_tty_write() is a globally visible call target.
>> (Besides, the load can't be optimized out because of the LOCK rwsem.)
>
> The compiler may assume that the variables that it is reading don't
> change. Usually it doesn't use this assumption (that why omitting
> ACCESS_ONCE in most cases doesn't result in any observable wrongdoing),
> but nonetheless, the compiler may use this assumption.
>
> For example, if the compiler proves that action1() and action2() don't
> change the variable, it may transform this piece of code:
> 	if (variable) {
> 		action1();
> 		action2();
> 		action3();
> 	} else {
> 		action2();
> 	}
> into this piece of code:
> 	if (variable)
> 		action1();
> 	action2();
> 	if (variable)
> 		action3();
>
> - obviously, if the variable changes asynchronously, it results in
> unintended bahavior.

Sure, but that's not the situation you brought up; you're suggesting
that the compiler will optimize out either/both loads of the echo buffer
indices that happen immediately following the LOCK rwsem (or assuming
the LOCK rwsem is removed, the beginning of a exported function).

The only way that happens is if the compiler and/or the LOCK barrier
is broken; so again, nothing here to fix.

>>> I suggest that you change commit_echoes to always write the buffer and
>>> flush it with tty->ops->flush_chars(tty). And then, you can drop
>>> process_echoes from n_tty_write. And then, there will be no asynchronous
>>> access to the buffer pointers.
>>
>> process_echoes() cannot be dropped from n_tty_write() because, even without
>> block commits, the driver's output buffer may be full and unable to accept
>> more input. That's why the echo buffer exists.
>
> Let me ask you this question:
>
> Does the function __process_echoes (in 3.12) or process_echoes (in 3.11)
> guarantee that the echo buffer is emptied and all characters in the buffer
> are sent to the terminal?

No. __process_echoes does not (and cannot) guarantee that the echo buffer
can be completely pushed (and would eliminate the need for a 4K echo
buffer if it could).

> - if it has this guarantee, then you don't need to call that function in
> n_tty_write. It is just enough to call it before adding the character to
> the input queue.
>
> - if it doesn't have this guarantee, then then the code is already buggy:
> suppose for example this race condition:
> 1. the user presses enter
> 2. '\n' is added to the echo buffer
> 3. the echo buffer is not flushed because tty_write_room returns zero
> 4. the program reads '\n' from the input buffer
> 5. the program writes the string "prompt>" to the terminal
> 6. n_tty_write calls process_echoes, it still doesn't echo anything
>     because tty_write_room returns zero
> 7. the terminal driver makes some progress and makes some room in its
>     buffer so that tty_write_room no longer returns zero
> 8. n_tty_write writes the string "prompt>" while '\n' is still sitting in
>     the echo buffer

Yep, known bug.

If you suggest the trivial fix is to always prefer echoes over outgoing
output, then that's just as buggy; one end will be able to starve the other
and will only ever see it's own writes.

An output 'clock' would be one way to fix this. Need a project? :)

> Another problem - if __process_echoes doesn't flush the echo buffer, who
> does flush it afterwards? You need to spawn a workqueue that waits on
> tty->write_wait and flushes the echo buffer when the terminal drivers
> makes some room.

Yep, again known bug, for the same reason, although more complicated:
how to handle signal-driven i/o.

>>>>> Another problem: what about this in process_echoes and flush_echoes?
>>>>> if (!L_ECHO(tty) || ldata->echo_commit == ldata->echo_tail)
>>>>> 	return;
>>>>> - so if the user turns off echo, echoes are not performed. But the
>>>>> buffer
>>>>> is not flushed. So when the user turns on echo again, previously
>>>>> buffered
>>>>> characters will be echoed. That is wrong.
>>>>
>>>> The check for !L_ECHO pre-dates my patches; it might be wrong but
>>>> userspace
>>>> may have come to rely on this behavior. That said, feel free to submit a
>>>> fix
>>>> for that, if you think it's broken.
>>>
>>> We should just clear the buffer on !L_ECHO. Or maybe (once we get rid of
>>> the asynchronous buffer access) do not test here for L_ECHO at all - if
>>> L_ECHO isn't set, then nothing is appended to the buffer. Consequently we
>>> don't have to check for L_ECHO when we are flushing the buffer.
>>
>> Discarding the echo buffer with L_ECHO -> !L_ECHO changes will almost
>> certainly break something. I considered attempting to push the echo
>> buffer when that change happens in n_tty_set_termios() but simply
>> haven't gotten to it for testing; and that still wouldn't get rid of the
>> need to check if echoes need to be pushed when !L_ECHO.
>
> If there is !L_ECHO, than no characters are added to the echo buffer.
> Then, you test L_ECHO again, when flushing the echo buffer.
>
> So I think there's a race condition:
> 1. L_ECHO is on
> 2. the user types some characters, they are added to the echo buffer
> 3. the program turns L_ECHO off
> 4. process_echoes and flush_echoes see that L_ECHO is off, so they don't
>     flush the buffer - but the buffer still contains the characters
> 5. some time passes
> 6. the program turns L_ECHO on
> 7. the characters typed in step 2. are still in the buffer and they are
>     echoed now - this is WRONG - the characters typed in step 2. should be
>     either echoed immediatelly or not echoed at all
>
> The kernel 3.11 doesn't have this bug (it doesn't check for L_ECHO in
> process_echoes).

Yeah, you're right that 3.11- doesn't check for !L_ECHO in the
n_tty_write() path; I'll send a patch to fix that (I must have been
looking at the wrong tree/branch when I wrote that earlier, sorry).

Regards,
Peter Hurley



^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2013-12-13 21:24 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-12-09  2:55 [PATCH 3.12] Broken terminal due to echo bufferring Mikulas Patocka
2013-12-09  4:26 ` Karl Dahlke
2013-12-09 15:43 ` Peter Hurley
2013-12-09 22:18   ` Mikulas Patocka
2013-12-10  0:01     ` Peter Hurley
2013-12-10  2:29       ` Mikulas Patocka
2013-12-10 10:48         ` Karl Dahlke
2013-12-10 11:00           ` Peter Hurley
2013-12-10 10:52         ` Karl Dahlke
2013-12-10 14:34         ` Peter Hurley
2013-12-11 16:29           ` Mikulas Patocka
2013-12-13 21:24             ` Peter Hurley

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox