All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] tty: fix stall caused by missing memory barrier in drivers/tty/n_tty.c
@ 2015-10-02  8:27 Kosuke Tatsukawa
  2015-10-09 17:28 ` Peter Hurley
  2015-10-09 20:08 ` Peter Hurley
  0 siblings, 2 replies; 5+ messages in thread
From: Kosuke Tatsukawa @ 2015-10-02  8:27 UTC (permalink / raw)
  To: Peter Hurley, Greg Kroah-Hartman, Jiri Slaby; +Cc: linux-kernel@vger.kernel.org

My colleague ran into a program stall on a x86_64 server, where
n_tty_read() was waiting for data even if there was data in the buffer
in the pty.  kernel stack for the stuck process looks like below.
 #0 [ffff88303d107b58] __schedule at ffffffff815c4b20
 #1 [ffff88303d107bd0] schedule at ffffffff815c513e
 #2 [ffff88303d107bf0] schedule_timeout at ffffffff815c7818
 #3 [ffff88303d107ca0] wait_woken at ffffffff81096bd2
 #4 [ffff88303d107ce0] n_tty_read at ffffffff8136fa23
 #5 [ffff88303d107dd0] tty_read at ffffffff81368013
 #6 [ffff88303d107e20] __vfs_read at ffffffff811a3704
 #7 [ffff88303d107ec0] vfs_read at ffffffff811a3a57
 #8 [ffff88303d107f00] sys_read at ffffffff811a4306
 #9 [ffff88303d107f50] entry_SYSCALL_64_fastpath at ffffffff815c86d7

There seems to be two problems causing this issue.

First, in drivers/tty/n_tty.c, __receive_buf() stores the data and
updates ldata->commit_head using smp_store_release() and then checks
the wait queue using waitqueue_active().  However, since there is no
memory barrier, __receive_buf() could return without calling
wake_up_interactive_poll(), and at the same time, n_tty_read() could
start to wait in wait_woken() as in the following chart.

        __receive_buf()                         n_tty_read()
------------------------------------------------------------------------
if (waitqueue_active(&tty->read_wait))
/* Memory operations issued after the
   RELEASE may be completed before the
   RELEASE operation has completed */
                                        add_wait_queue(&tty->read_wait, &wait);
                                        ...
                                        if (!input_available_p(tty, 0)) {
smp_store_release(&ldata->commit_head,
                  ldata->read_head);
                                        ...
                                        timeout = wait_woken(&wait,
                                          TASK_INTERRUPTIBLE, timeout);
------------------------------------------------------------------------

The second problem is that n_tty_read() also lacks a memory barrier
call and could also cause __receive_buf() to return without calling
wake_up_interactive_poll(), and n_tty_read() to wait in wait_woken()
as in the chart below.

        __receive_buf()                         n_tty_read()
------------------------------------------------------------------------
                                        spin_lock_irqsave(&q->lock, flags);
                                        /* from add_wait_queue() */
                                        ...
                                        if (!input_available_p(tty, 0)) {
                                        /* Memory operations issued after the
                                           RELEASE may be completed before the
                                           RELEASE operation has completed */
smp_store_release(&ldata->commit_head,
                  ldata->read_head);
if (waitqueue_active(&tty->read_wait))
                                        __add_wait_queue(q, wait);
                                        spin_unlock_irqrestore(&q->lock,flags);
                                        /* from add_wait_queue() */
                                        ...
                                        timeout = wait_woken(&wait,
                                          TASK_INTERRUPTIBLE, timeout);
------------------------------------------------------------------------

There are also other places in drivers/tty/n_tty.c which have similar
calls to waitqueue_active(), so instead of adding many memory barrier
calls, this patch simply removes the call to waitqueue_active(),
leaving just wake_up*() behind.

This fixes both problems because, even though the memory access before
or after the spinlocks in both wake_up*() and add_wait_queue() can
sneak into the critical section, it cannot go past it and the critical
section assures that they will be serialized (please see "INTER-CPU
ACQUIRING BARRIER EFFECTS" in Documentation/memory-barriers.txt for a
better explanation).  Moreover, the resulting code is much simpler.

Latency measurement using a ping-pong test over a pty doesn't show any
visible performance drop.

Signed-off-by: Kosuke Tatsukawa <tatsu@ab.jp.nec.com>
Cc: stable@vger.kernel.org
---
v2:
  - Removed unnecessary hunks from v1 based on comments from Peter.
v1:
  - https://lkml.org/lkml/2015/9/28/849
---
 drivers/tty/n_tty.c |   15 +++++----------
 1 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index 20932cc..b09023b 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -343,8 +343,7 @@ static void n_tty_packet_mode_flush(struct tty_struct *tty)
 		spin_lock_irqsave(&tty->ctrl_lock, flags);
 		tty->ctrl_status |= TIOCPKT_FLUSHREAD;
 		spin_unlock_irqrestore(&tty->ctrl_lock, flags);
-		if (waitqueue_active(&tty->link->read_wait))
-			wake_up_interruptible(&tty->link->read_wait);
+		wake_up_interruptible(&tty->link->read_wait);
 	}
 }
 
@@ -1382,8 +1381,7 @@ handle_newline:
 			put_tty_queue(c, ldata);
 			smp_store_release(&ldata->canon_head, ldata->read_head);
 			kill_fasync(&tty->fasync, SIGIO, POLL_IN);
-			if (waitqueue_active(&tty->read_wait))
-				wake_up_interruptible_poll(&tty->read_wait, POLLIN);
+			wake_up_interruptible_poll(&tty->read_wait, POLLIN);
 			return 0;
 		}
 	}
@@ -1667,8 +1665,7 @@ static void __receive_buf(struct tty_struct *tty, const unsigned char *cp,
 
 	if ((read_cnt(ldata) >= ldata->minimum_to_wake) || L_EXTPROC(tty)) {
 		kill_fasync(&tty->fasync, SIGIO, POLL_IN);
-		if (waitqueue_active(&tty->read_wait))
-			wake_up_interruptible_poll(&tty->read_wait, POLLIN);
+		wake_up_interruptible_poll(&tty->read_wait, POLLIN);
 	}
 }
 
@@ -1887,10 +1884,8 @@ static void n_tty_set_termios(struct tty_struct *tty, struct ktermios *old)
 	}
 
 	/* The termios change make the tty ready for I/O */
-	if (waitqueue_active(&tty->write_wait))
-		wake_up_interruptible(&tty->write_wait);
-	if (waitqueue_active(&tty->read_wait))
-		wake_up_interruptible(&tty->read_wait);
+	wake_up_interruptible(&tty->write_wait);
+	wake_up_interruptible(&tty->read_wait);
 }
 
 /**
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] tty: fix stall caused by missing memory barrier in drivers/tty/n_tty.c
  2015-10-02  8:27 [PATCH v2] tty: fix stall caused by missing memory barrier in drivers/tty/n_tty.c Kosuke Tatsukawa
@ 2015-10-09 17:28 ` Peter Hurley
  2015-10-09 20:22   ` Peter Hurley
  2015-10-09 20:08 ` Peter Hurley
  1 sibling, 1 reply; 5+ messages in thread
From: Peter Hurley @ 2015-10-09 17:28 UTC (permalink / raw)
  To: Kosuke Tatsukawa
  Cc: Greg Kroah-Hartman, Jiri Slaby, linux-kernel@vger.kernel.org

On 10/02/2015 04:27 AM, Kosuke Tatsukawa wrote:
> My colleague ran into a program stall on a x86_64 server, where
> n_tty_read() was waiting for data even if there was data in the buffer
> in the pty.  kernel stack for the stuck process looks like below.
>  #0 [ffff88303d107b58] __schedule at ffffffff815c4b20
>  #1 [ffff88303d107bd0] schedule at ffffffff815c513e
>  #2 [ffff88303d107bf0] schedule_timeout at ffffffff815c7818
>  #3 [ffff88303d107ca0] wait_woken at ffffffff81096bd2
>  #4 [ffff88303d107ce0] n_tty_read at ffffffff8136fa23
>  #5 [ffff88303d107dd0] tty_read at ffffffff81368013
>  #6 [ffff88303d107e20] __vfs_read at ffffffff811a3704
>  #7 [ffff88303d107ec0] vfs_read at ffffffff811a3a57
>  #8 [ffff88303d107f00] sys_read at ffffffff811a4306
>  #9 [ffff88303d107f50] entry_SYSCALL_64_fastpath at ffffffff815c86d7
> 
> There seems to be two problems causing this issue.
> 
> First, in drivers/tty/n_tty.c, __receive_buf() stores the data and
> updates ldata->commit_head using smp_store_release() and then checks
> the wait queue using waitqueue_active().  However, since there is no
> memory barrier, __receive_buf() could return without calling
> wake_up_interactive_poll(), and at the same time, n_tty_read() could
> start to wait in wait_woken() as in the following chart.
> 
>         __receive_buf()                         n_tty_read()
> ------------------------------------------------------------------------
> if (waitqueue_active(&tty->read_wait))
> /* Memory operations issued after the
>    RELEASE may be completed before the
>    RELEASE operation has completed */
>                                         add_wait_queue(&tty->read_wait, &wait);
>                                         ...
>                                         if (!input_available_p(tty, 0)) {
> smp_store_release(&ldata->commit_head,
>                   ldata->read_head);
>                                         ...
>                                         timeout = wait_woken(&wait,
>                                           TASK_INTERRUPTIBLE, timeout);
> ------------------------------------------------------------------------

Tatsukawa-san,

I would still like to root-cause the reported stall; is the reported
stall resolved if smp_mb() is added before the waitqueue_active()
in __receive_buf()?


Regards,
Peter Hurley

> The second problem is that n_tty_read() also lacks a memory barrier
> call and could also cause __receive_buf() to return without calling
> wake_up_interactive_poll(), and n_tty_read() to wait in wait_woken()
> as in the chart below.
> 
>         __receive_buf()                         n_tty_read()
> ------------------------------------------------------------------------
>                                         spin_lock_irqsave(&q->lock, flags);
>                                         /* from add_wait_queue() */
>                                         ...
>                                         if (!input_available_p(tty, 0)) {
>                                         /* Memory operations issued after the
>                                            RELEASE may be completed before the
>                                            RELEASE operation has completed */
> smp_store_release(&ldata->commit_head,
>                   ldata->read_head);
> if (waitqueue_active(&tty->read_wait))
>                                         __add_wait_queue(q, wait);
>                                         spin_unlock_irqrestore(&q->lock,flags);
>                                         /* from add_wait_queue() */
>                                         ...
>                                         timeout = wait_woken(&wait,
>                                           TASK_INTERRUPTIBLE, timeout);
> ------------------------------------------------------------------------
> 
> There are also other places in drivers/tty/n_tty.c which have similar
> calls to waitqueue_active(), so instead of adding many memory barrier
> calls, this patch simply removes the call to waitqueue_active(),
> leaving just wake_up*() behind.
> 
> This fixes both problems because, even though the memory access before
> or after the spinlocks in both wake_up*() and add_wait_queue() can
> sneak into the critical section, it cannot go past it and the critical
> section assures that they will be serialized (please see "INTER-CPU
> ACQUIRING BARRIER EFFECTS" in Documentation/memory-barriers.txt for a
> better explanation).  Moreover, the resulting code is much simpler.
> 
> Latency measurement using a ping-pong test over a pty doesn't show any
> visible performance drop.
> 
> Signed-off-by: Kosuke Tatsukawa <tatsu@ab.jp.nec.com>
> Cc: stable@vger.kernel.org
> ---
> v2:
>   - Removed unnecessary hunks from v1 based on comments from Peter.
> v1:
>   - https://lkml.org/lkml/2015/9/28/849
> ---
>  drivers/tty/n_tty.c |   15 +++++----------
>  1 files changed, 5 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
> index 20932cc..b09023b 100644
> --- a/drivers/tty/n_tty.c
> +++ b/drivers/tty/n_tty.c
> @@ -343,8 +343,7 @@ static void n_tty_packet_mode_flush(struct tty_struct *tty)
>  		spin_lock_irqsave(&tty->ctrl_lock, flags);
>  		tty->ctrl_status |= TIOCPKT_FLUSHREAD;
>  		spin_unlock_irqrestore(&tty->ctrl_lock, flags);
> -		if (waitqueue_active(&tty->link->read_wait))
> -			wake_up_interruptible(&tty->link->read_wait);
> +		wake_up_interruptible(&tty->link->read_wait);
>  	}
>  }
>  
> @@ -1382,8 +1381,7 @@ handle_newline:
>  			put_tty_queue(c, ldata);
>  			smp_store_release(&ldata->canon_head, ldata->read_head);
>  			kill_fasync(&tty->fasync, SIGIO, POLL_IN);
> -			if (waitqueue_active(&tty->read_wait))
> -				wake_up_interruptible_poll(&tty->read_wait, POLLIN);
> +			wake_up_interruptible_poll(&tty->read_wait, POLLIN);
>  			return 0;
>  		}
>  	}
> @@ -1667,8 +1665,7 @@ static void __receive_buf(struct tty_struct *tty, const unsigned char *cp,
>  
>  	if ((read_cnt(ldata) >= ldata->minimum_to_wake) || L_EXTPROC(tty)) {
>  		kill_fasync(&tty->fasync, SIGIO, POLL_IN);
> -		if (waitqueue_active(&tty->read_wait))
> -			wake_up_interruptible_poll(&tty->read_wait, POLLIN);
> +		wake_up_interruptible_poll(&tty->read_wait, POLLIN);
>  	}
>  }
>  
> @@ -1887,10 +1884,8 @@ static void n_tty_set_termios(struct tty_struct *tty, struct ktermios *old)
>  	}
>  
>  	/* The termios change make the tty ready for I/O */
> -	if (waitqueue_active(&tty->write_wait))
> -		wake_up_interruptible(&tty->write_wait);
> -	if (waitqueue_active(&tty->read_wait))
> -		wake_up_interruptible(&tty->read_wait);
> +	wake_up_interruptible(&tty->write_wait);
> +	wake_up_interruptible(&tty->read_wait);
>  }
>  
>  /**
> 


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] tty: fix stall caused by missing memory barrier in drivers/tty/n_tty.c
  2015-10-02  8:27 [PATCH v2] tty: fix stall caused by missing memory barrier in drivers/tty/n_tty.c Kosuke Tatsukawa
  2015-10-09 17:28 ` Peter Hurley
@ 2015-10-09 20:08 ` Peter Hurley
  1 sibling, 0 replies; 5+ messages in thread
From: Peter Hurley @ 2015-10-09 20:08 UTC (permalink / raw)
  To: Kosuke Tatsukawa, Greg Kroah-Hartman, Jiri Slaby
  Cc: linux-kernel@vger.kernel.org

On 10/02/2015 04:27 AM, Kosuke Tatsukawa wrote:
> My colleague ran into a program stall on a x86_64 server, where
> n_tty_read() was waiting for data even if there was data in the buffer
> in the pty.  kernel stack for the stuck process looks like below.
>  #0 [ffff88303d107b58] __schedule at ffffffff815c4b20
>  #1 [ffff88303d107bd0] schedule at ffffffff815c513e
>  #2 [ffff88303d107bf0] schedule_timeout at ffffffff815c7818
>  #3 [ffff88303d107ca0] wait_woken at ffffffff81096bd2
>  #4 [ffff88303d107ce0] n_tty_read at ffffffff8136fa23
>  #5 [ffff88303d107dd0] tty_read at ffffffff81368013
>  #6 [ffff88303d107e20] __vfs_read at ffffffff811a3704
>  #7 [ffff88303d107ec0] vfs_read at ffffffff811a3a57
>  #8 [ffff88303d107f00] sys_read at ffffffff811a4306
>  #9 [ffff88303d107f50] entry_SYSCALL_64_fastpath at ffffffff815c86d7
> 
> There seems to be two problems causing this issue.
> 
> First, in drivers/tty/n_tty.c, __receive_buf() stores the data and
> updates ldata->commit_head using smp_store_release() and then checks
> the wait queue using waitqueue_active().  However, since there is no
> memory barrier, __receive_buf() could return without calling
> wake_up_interactive_poll(), and at the same time, n_tty_read() could
> start to wait in wait_woken() as in the following chart.
> 
>         __receive_buf()                         n_tty_read()
> ------------------------------------------------------------------------
> if (waitqueue_active(&tty->read_wait))
> /* Memory operations issued after the
>    RELEASE may be completed before the
>    RELEASE operation has completed */
>                                         add_wait_queue(&tty->read_wait, &wait);
>                                         ...
>                                         if (!input_available_p(tty, 0)) {
> smp_store_release(&ldata->commit_head,
>                   ldata->read_head);
>                                         ...
>                                         timeout = wait_woken(&wait,
>                                           TASK_INTERRUPTIBLE, timeout);
> ------------------------------------------------------------------------
> 
> The second problem is that n_tty_read() also lacks a memory barrier
> call and could also cause __receive_buf() to return without calling
> wake_up_interactive_poll(), and n_tty_read() to wait in wait_woken()
> as in the chart below.
> 
>         __receive_buf()                         n_tty_read()
> ------------------------------------------------------------------------
>                                         spin_lock_irqsave(&q->lock, flags);
>                                         /* from add_wait_queue() */
>                                         ...
>                                         if (!input_available_p(tty, 0)) {
>                                         /* Memory operations issued after the
>                                            RELEASE may be completed before the
>                                            RELEASE operation has completed */
> smp_store_release(&ldata->commit_head,
>                   ldata->read_head);
> if (waitqueue_active(&tty->read_wait))
>                                         __add_wait_queue(q, wait);
>                                         spin_unlock_irqrestore(&q->lock,flags);
>                                         /* from add_wait_queue() */
>                                         ...
>                                         timeout = wait_woken(&wait,
>                                           TASK_INTERRUPTIBLE, timeout);
> ------------------------------------------------------------------------
> 
> There are also other places in drivers/tty/n_tty.c which have similar
> calls to waitqueue_active(), so instead of adding many memory barrier
> calls, this patch simply removes the call to waitqueue_active(),
> leaving just wake_up*() behind.
> 
> This fixes both problems because, even though the memory access before
> or after the spinlocks in both wake_up*() and add_wait_queue() can
> sneak into the critical section, it cannot go past it and the critical
> section assures that they will be serialized (please see "INTER-CPU
> ACQUIRING BARRIER EFFECTS" in Documentation/memory-barriers.txt for a
> better explanation).  Moreover, the resulting code is much simpler.
> 
> Latency measurement using a ping-pong test over a pty doesn't show any
> visible performance drop.

Reviewed-by: Peter Hurley <peter@hurleysoftware.com>


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] tty: fix stall caused by missing memory barrier in drivers/tty/n_tty.c
  2015-10-09 17:28 ` Peter Hurley
@ 2015-10-09 20:22   ` Peter Hurley
  2015-10-10  5:04     ` Kosuke Tatsukawa
  0 siblings, 1 reply; 5+ messages in thread
From: Peter Hurley @ 2015-10-09 20:22 UTC (permalink / raw)
  To: Kosuke Tatsukawa
  Cc: Greg Kroah-Hartman, Jiri Slaby, linux-kernel@vger.kernel.org

On 10/09/2015 01:28 PM, Peter Hurley wrote:
> Tatsukawa-san,
> 
> I would still like to root-cause the reported stall; is the reported
> stall resolved if smp_mb() is added before the waitqueue_active()
> in __receive_buf()?

Nevermind, I see it now.

The store to commit_head is deferred until after the load of read_wait->next;
a full memory barrier would properly order the store before the load but,
since that is roughly equivalent to taking the spin lock for the wake up anyway,
it makes sense to just always do the wakeup.

Thanks again.

Regards,
Peter Hurley

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] tty: fix stall caused by missing memory barrier in drivers/tty/n_tty.c
  2015-10-09 20:22   ` Peter Hurley
@ 2015-10-10  5:04     ` Kosuke Tatsukawa
  0 siblings, 0 replies; 5+ messages in thread
From: Kosuke Tatsukawa @ 2015-10-10  5:04 UTC (permalink / raw)
  To: Peter Hurley; +Cc: Greg Kroah-Hartman, Jiri Slaby, linux-kernel@vger.kernel.org

Peter Hurley wrote:
> On 10/09/2015 01:28 PM, Peter Hurley wrote:
>> Tatsukawa-san,
>> 
>> I would still like to root-cause the reported stall; is the reported
>> stall resolved if smp_mb() is added before the waitqueue_active()
>> in __receive_buf()?
>
> Nevermind, I see it now.
>
> The store to commit_head is deferred until after the load of read_wait->next;
> a full memory barrier would properly order the store before the load but,
> since that is roughly equivalent to taking the spin lock for the wake up anyway,
> it makes sense to just always do the wakeup.

The stall problem is resolved if smp_mb() is added both before the
waitqueue_active() in __receive_buf(), and before the line containing
input_available_p() in n_tty_read().
---
Kosuke TATSUKAWA  | 3rd IT Platform Department
                  | IT Platform Division, NEC Corporation
                  | tatsu@ab.jp.nec.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2015-10-10  5:05 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-10-02  8:27 [PATCH v2] tty: fix stall caused by missing memory barrier in drivers/tty/n_tty.c Kosuke Tatsukawa
2015-10-09 17:28 ` Peter Hurley
2015-10-09 20:22   ` Peter Hurley
2015-10-10  5:04     ` Kosuke Tatsukawa
2015-10-09 20:08 ` Peter Hurley

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.