From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail.linuxfoundation.org ([140.211.169.12]:48503 "EHLO mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757452AbcJZISC (ORCPT ); Wed, 26 Oct 2016 04:18:02 -0400 Subject: Patch "sunrpc: fix write space race causing stalls" has been added to the 4.8-stable tree To: david.vrabel@citrix.com, Anna.Schumaker@Netapp.com, gregkh@linuxfoundation.org, trondmy@primarydata.com Cc: , From: Date: Wed, 26 Oct 2016 10:16:26 +0200 Message-ID: <147746978617986@kroah.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ANSI_X3.4-1968 Content-Transfer-Encoding: 8bit Sender: stable-owner@vger.kernel.org List-ID: This is a note to let you know that I've just added the patch titled sunrpc: fix write space race causing stalls to the 4.8-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: sunrpc-fix-write-space-race-causing-stalls.patch and it can be found in the queue-4.8 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let know about it. >>From d48f9ce73c997573e1b512893fa6eddf353a6f69 Mon Sep 17 00:00:00 2001 From: David Vrabel Date: Mon, 19 Sep 2016 13:58:30 +0100 Subject: sunrpc: fix write space race causing stalls From: David Vrabel commit d48f9ce73c997573e1b512893fa6eddf353a6f69 upstream. Write space becoming available may race with putting the task to sleep in xprt_wait_for_buffer_space(). The existing mechanism to avoid the race does not work. This (edited) partial trace illustrates the problem: [1] rpc_task_run_action: task:43546@5 ... action=call_transmit [2] xs_write_space <-xs_tcp_write_space [3] xprt_write_space <-xs_write_space [4] rpc_task_sleep: task:43546@5 ... [5] xs_write_space <-xs_tcp_write_space [1] Task 43546 runs but is out of write space. [2] Space becomes available, xs_write_space() clears the SOCKWQ_ASYNC_NOSPACE bit. [3] xprt_write_space() attemts to wake xprt->snd_task (== 43546), but this has not yet been queued and the wake up is lost. [4] xs_nospace() is called which calls xprt_wait_for_buffer_space() which queues task 43546. [5] The call to sk->sk_write_space() at the end of xs_nospace() (which is supposed to handle the above race) does not call xprt_write_space() as the SOCKWQ_ASYNC_NOSPACE bit is clear and thus the task is not woken. Fix the race by resetting the SOCKWQ_ASYNC_NOSPACE bit in xs_nospace() so the second call to sk->sk_write_space() calls xprt_write_space(). Suggested-by: Trond Myklebust Signed-off-by: David Vrabel Signed-off-by: Anna Schumaker Signed-off-by: Greg Kroah-Hartman --- net/sunrpc/xprtsock.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) --- a/net/sunrpc/xprtsock.c +++ b/net/sunrpc/xprtsock.c @@ -473,7 +473,16 @@ static int xs_nospace(struct rpc_task *t spin_unlock_bh(&xprt->transport_lock); /* Race breaker in case memory is freed before above code is called */ - sk->sk_write_space(sk); + if (ret == -EAGAIN) { + struct socket_wq *wq; + + rcu_read_lock(); + wq = rcu_dereference(sk->sk_wq); + set_bit(SOCKWQ_ASYNC_NOSPACE, &wq->flags); + rcu_read_unlock(); + + sk->sk_write_space(sk); + } return ret; } Patches currently in stable-queue which might be from david.vrabel@citrix.com are queue-4.8/sunrpc-fix-write-space-race-causing-stalls.patch