From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751918AbbIQOu1 (ORCPT );
	Thu, 17 Sep 2015 10:50:27 -0400
Received: from mail-ig0-f169.google.com ([209.85.213.169]:37156 "EHLO
	mail-ig0-f169.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751827AbbIQOuZ (ORCPT );
	Thu, 17 Sep 2015 10:50:25 -0400
Message-ID: <1442501401.12852.1.camel@primarydata.com>
Subject: Re: [PATCH] SUNRPC: Fix a race in xs_reset_transport
From: Trond Myklebust
To: Jeff Layton
Cc: "Suzuki K. Poulose", Anna Schumaker, "J. Bruce Fields",
	"David S. Miller", Linux NFS Mailing List,
	Linux Kernel Mailing List
Date: Thu, 17 Sep 2015 10:50:01 -0400
In-Reply-To: <20150917101847.74ee85ac@synchrony.poochiereds.net>
References: <1442332163-9230-1-git-send-email-suzuki.poulose@arm.com>
	<20150915145229.4e69d5f7@synchrony.poochiereds.net>
	<20150917101847.74ee85ac@synchrony.poochiereds.net>
Organization: PrimaryData, Inc
Content-Type: text/plain; charset="UTF-8"
X-Mailer: Evolution 3.16.5 (3.16.5-1.fc22)
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 2015-09-17 at 10:18 -0400, Jeff Layton wrote:
> On Thu, 17 Sep 2015 09:38:33 -0400
> Trond Myklebust wrote:
>
> > On Tue, Sep 15, 2015 at 2:52 PM, Jeff Layton <jlayton@poochiereds.net> wrote:
> > > On Tue, 15 Sep 2015 16:49:23 +0100
> > > "Suzuki K. Poulose" wrote:
> > >
> > > >  net/sunrpc/xprtsock.c | 9 ++++++++-
> > > >  1 file changed, 8 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> > > > index 7be90bc..6f4789d 100644
> > > > --- a/net/sunrpc/xprtsock.c
> > > > +++ b/net/sunrpc/xprtsock.c
> > > > @@ -822,9 +822,16 @@ static void xs_reset_transport(struct sock_xprt *transport)
> > > >  	if (atomic_read(&transport->xprt.swapper))
> > > >  		sk_clear_memalloc(sk);
> > > >
> > > > -	kernel_sock_shutdown(sock, SHUT_RDWR);
> > > > +	if (sock)
> > > > +		kernel_sock_shutdown(sock, SHUT_RDWR);
> > > >
> > >
> > > Good catch, but...isn't this still racy? What prevents
> > > transport->sock being set to NULL after you assign it to "sock"
> > > but before calling kernel_sock_shutdown?
> >
> > The XPRT_LOCKED state.
> >
>
> IDGI -- if the XPRT_LOCKED bit was supposed to prevent that, then
> how could you hit the original race? There should be no concurrent
> callers to xs_reset_transport on the same xprt, right?

Correct. The only exception is xs_destroy.

> AFAICT, that bit is not set in the xprt_destroy codepath, which may be
> the root cause of the problem. How would we take it there anyway?
> xprt_destroy is void return, and may not be called in the context of a
> rpc_task. If it's contended, what do we do? Sleep until it's cleared?
>

How about the following.

8<-----------------------------------------------------------------
>>From e2e68218e66c6b0715fd6b8f1b3092694a7c0e62 Mon Sep 17 00:00:00 2001
From: Trond Myklebust
Date: Thu, 17 Sep 2015 10:42:27 -0400
Subject: [PATCH] SUNRPC: Fix races between socket connection and destroy code

When we're destroying the socket transport, we need to ensure that
we cancel any existing delayed connection attempts, and order them
w.r.t. the call to xs_close().

Signed-off-by: Trond Myklebust
---
 net/sunrpc/xprtsock.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 7be90bc1a7c2..d2dfbd043bea 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -881,8 +881,11 @@ static void xs_xprt_free(struct rpc_xprt *xprt)
  */
 static void xs_destroy(struct rpc_xprt *xprt)
 {
+	struct sock_xprt *transport = container_of(xprt,
+			struct sock_xprt, xprt);
 	dprintk("RPC: xs_destroy xprt %p\n", xprt);
 
+	cancel_delayed_work_sync(&transport->connect_worker);
 	xs_close(xprt);
 	xs_xprt_free(xprt);
 	module_put(THIS_MODULE);
-- 
2.4.3

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData

trond.myklebust@primarydata.com