From: Trond Myklebust <trond.myklebust@primarydata.com>
To: "Suzuki K. Poulose" <Suzuki.Poulose@arm.com>,
Jeff Layton <jlayton@poochiereds.net>
Cc: Anna Schumaker <anna.schumaker@netapp.com>,
"J. Bruce Fields" <bfields@fieldses.org>,
"David S. Miller" <davem@davemloft.net>,
"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Marc Zyngier <Marc.Zyngier@arm.com>
Subject: Re: [PATCHv2] SUNRPC: Fix a race in xs_reset_transport
Date: Fri, 18 Sep 2015 18:00:07 -0400 [thread overview]
Message-ID: <1442613607.11370.18.camel@primarydata.com> (raw)
In-Reply-To: <1442595095.11370.13.camel@primarydata.com>
On Fri, 2015-09-18 at 12:51 -0400, Trond Myklebust wrote:
> On Fri, 2015-09-18 at 12:19 +0100, Suzuki K. Poulose wrote:
> > On 16/09/15 12:17, Jeff Layton wrote:
> > > On Wed, 16 Sep 2015 10:35:49 +0100
> > > "Suzuki K. Poulose" <suzuki.poulose@arm.com> wrote:
> > >
> > > > From: "Suzuki K. Poulose" <suzuki.poulose@arm.com>
> > > >
> >
> > ...
> >
> > > > + write_unlock_bh(&sk->sk_callback_lock);
> > > > + return;
> > > > + }
> > > > + sock = transport->sock;
> > > > +
> > > > transport->inet = NULL;
> > > > transport->sock = NULL;
> > > >
> > > > @@ -833,6 +838,10 @@ static void xs_reset_transport(struct
> > > > sock_xprt *transport)
> > > > xs_restore_old_callbacks(transport, sk);
> > > > xprt_clear_connected(xprt);
> > > > write_unlock_bh(&sk->sk_callback_lock);
> > > > +
> > > > + if (sock)
> > > > + kernel_sock_shutdown(sock, SHUT_RDWR);
> > > > +
> > > > xs_sock_reset_connection_flags(xprt);
> > > >
> > > > trace_rpc_socket_close(xprt, sock);
> > >
> > > Better, but now I'm wondering...is it problematic to restore the
> > > old
> > > callbacks before calling kernel_sock_shutdown? I can't quite tell
> > > whether it matters in all cases.
> > >
> > > It might be best to just go ahead and take the spinlock twice
> > > here.
> > > Do
> > > it once to clear the transport->sock pointer, call
> > > kernel_sock_shutdown, and then take it again to restore the old
> > > callbacks, etc.
> > >
> > > I don't know though...I get the feeling there are races all over
> > > the
> > > place in this code. It seems like there's a similar one wrt to
> > > the
> > > transport->inet pointer. It seems a little silly that we clear it
> > > under
> > > the sk->sk_callback_lock. You have to dereference that pointer
> > > in order to get to the lock.
> > >
> > > Maybe the right solution is to use an xchg to swap the inet
> > > pointer
> > > with NULL so it can act as a gatekeeper. Whoever gets there first
> > > does
> > > the rest of the shutdown.
> > >
> > > Something like this maybe? Would this also fix the original
> > > problem?
> > > Note that this patch is untested...
> > >
> > > [PATCH] sunrpc: use xchg to fetch and clear the transport->inet
> > > pointer in xs_reset_transport
> > >
> > > Reported-by: "Suzuki K. Poulose" <Suzuki.Poulose@arm.com>
> > > Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
> > > ---
> > > net/sunrpc/xprtsock.c | 4 ++--
> > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> > > index 7be90bc1a7c2..57f79dcab493 100644
> > > --- a/net/sunrpc/xprtsock.c
> > > +++ b/net/sunrpc/xprtsock.c
> > > @@ -813,9 +813,10 @@ static void xs_error_report(struct sock *sk)
> > > static void xs_reset_transport(struct sock_xprt *transport)
> > > {
> > > struct socket *sock = transport->sock;
> > > - struct sock *sk = transport->inet;
> > > + struct sock *sk;
> > > struct rpc_xprt *xprt = &transport->xprt;
> > >
> > > + sk = xchg(&transport->inet, NULL);
> > > if (sk == NULL)
> > > return;
> > >
> > > @@ -825,7 +826,6 @@ static void xs_reset_transport(struct
> > > sock_xprt
> > > *transport)
> > > kernel_sock_shutdown(sock, SHUT_RDWR);
> > >
> > > write_lock_bh(&sk->sk_callback_lock);
> > > - transport->inet = NULL;
> > > transport->sock = NULL;
> > >
> > > sk->sk_user_data = NULL;
> > >
> >
> >
> > This one seemed to fix it, so if it matters :
> >
> > Tested-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
>
>
> I don't think it does. It addresses a symptom, but the actual problem
> is that we're running 2 parallel close() calls on the same socket
> during a shutdown. That must not happen because it means we have
> something else trying to use the socket while it is being freed.
>
> I think what is happening is that we're triggering the socket
> autoclose
> mechanism from the state change callback. You're seeing the problem
> more frequently because we added the call to kernel_sock_shutdown()
> as
> part of the socket shutdown, but AFAICS, it could still be triggered
> from some external event such as a server-initiated shutdown that
> happened at the same time.
> In fact, looking at the code, it could even be triggered from the
> data
> receive side of things.
> Both of these things are bad, because autoclose puts the transport
> struct that is being freed onto a workqueue. That again can lead to a
> really bad use-after-free situation if the timing is just a little
> different.
>
> So how about the following patch? It should apply cleanly on top of
> the
> first one (which is still needed, btw).
Having thought some more about this, I think the safest thing in order
to avoid races is simply to have the shutdown set XPRT_LOCKED. That way
we can keep the current desirable behaviour of closing the socket
automatically any time the server initiates a close, while still
preventing it during shutdown.
8<-------------------------------------------------------------------
WARNING: multiple messages have this Message-ID (diff)
From: Trond Myklebust <trond.myklebust@primarydata.com>
To: "Suzuki K. Poulose" <Suzuki.Poulose@arm.com>,
Jeff Layton <jlayton@poochiereds.net>
Cc: Anna Schumaker <anna.schumaker@netapp.com>,
"J. Bruce Fields" <bfields@fieldses.org>,
"David S. Miller" <davem@davemloft.net>,
"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Marc Zyngier <Marc.Zyngier@arm.com>
Subject: Re: [PATCHv2] SUNRPC: Fix a race in xs_reset_transport
Date: Fri, 18 Sep 2015 18:00:07 -0400 [thread overview]
Message-ID: <1442613607.11370.18.camel@primarydata.com> (raw)
In-Reply-To: <1442595095.11370.13.camel@primarydata.com>
On Fri, 2015-09-18 at 12:51 -0400, Trond Myklebust wrote:
> On Fri, 2015-09-18 at 12:19 +0100, Suzuki K. Poulose wrote:
> > On 16/09/15 12:17, Jeff Layton wrote:
> > > On Wed, 16 Sep 2015 10:35:49 +0100
> > > "Suzuki K. Poulose" <suzuki.poulose@arm.com> wrote:
> > >
> > > > From: "Suzuki K. Poulose" <suzuki.poulose@arm.com>
> > > >
> >
> > ...
> >
> > > > + write_unlock_bh(&sk->sk_callback_lock);
> > > > + return;
> > > > + }
> > > > + sock = transport->sock;
> > > > +
> > > > transport->inet = NULL;
> > > > transport->sock = NULL;
> > > >
> > > > @@ -833,6 +838,10 @@ static void xs_reset_transport(struct
> > > > sock_xprt *transport)
> > > > xs_restore_old_callbacks(transport, sk);
> > > > xprt_clear_connected(xprt);
> > > > write_unlock_bh(&sk->sk_callback_lock);
> > > > +
> > > > + if (sock)
> > > > + kernel_sock_shutdown(sock, SHUT_RDWR);
> > > > +
> > > > xs_sock_reset_connection_flags(xprt);
> > > >
> > > > trace_rpc_socket_close(xprt, sock);
> > >
> > > Better, but now I'm wondering...is it problematic to restore the
> > > old
> > > callbacks before calling kernel_sock_shutdown? I can't quite tell
> > > whether it matters in all cases.
> > >
> > > It might be best to just go ahead and take the spinlock twice
> > > here.
> > > Do
> > > it once to clear the transport->sock pointer, call
> > > kernel_sock_shutdown, and then take it again to restore the old
> > > callbacks, etc.
> > >
> > > I don't know though...I get the feeling there are races all over
> > > the
> > > place in this code. It seems like there's a similar one wrt to
> > > the
> > > transport->inet pointer. It seems a little silly that we clear it
> > > under
> > > the sk->sk_callback_lock. You have to dereference that pointer
> > > in order to get to the lock.
> > >
> > > Maybe the right solution is to use an xchg to swap the inet
> > > pointer
> > > with NULL so it can act as a gatekeeper. Whoever gets there first
> > > does
> > > the rest of the shutdown.
> > >
> > > Something like this maybe? Would this also fix the original
> > > problem?
> > > Note that this patch is untested...
> > >
> > > [PATCH] sunrpc: use xchg to fetch and clear the transport->inet
> > > pointer in xs_reset_transport
> > >
> > > Reported-by: "Suzuki K. Poulose" <Suzuki.Poulose@arm.com>
> > > Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
> > > ---
> > > net/sunrpc/xprtsock.c | 4 ++--
> > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> > > index 7be90bc1a7c2..57f79dcab493 100644
> > > --- a/net/sunrpc/xprtsock.c
> > > +++ b/net/sunrpc/xprtsock.c
> > > @@ -813,9 +813,10 @@ static void xs_error_report(struct sock *sk)
> > > static void xs_reset_transport(struct sock_xprt *transport)
> > > {
> > > struct socket *sock = transport->sock;
> > > - struct sock *sk = transport->inet;
> > > + struct sock *sk;
> > > struct rpc_xprt *xprt = &transport->xprt;
> > >
> > > + sk = xchg(&transport->inet, NULL);
> > > if (sk == NULL)
> > > return;
> > >
> > > @@ -825,7 +826,6 @@ static void xs_reset_transport(struct
> > > sock_xprt
> > > *transport)
> > > kernel_sock_shutdown(sock, SHUT_RDWR);
> > >
> > > write_lock_bh(&sk->sk_callback_lock);
> > > - transport->inet = NULL;
> > > transport->sock = NULL;
> > >
> > > sk->sk_user_data = NULL;
> > >
> >
> >
> > This one seemed to fix it, so if it matters :
> >
> > Tested-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
>
>
> I don't think it does. It addresses a symptom, but the actual problem
> is that we're running 2 parallel close() calls on the same socket
> during a shutdown. That must not happen because it means we have
> something else trying to use the socket while it is being freed.
>
> I think what is happening is that we're triggering the socket
> autoclose
> mechanism from the state change callback. You're seeing the problem
> more frequently because we added the call to kernel_sock_shutdown()
> as
> part of the socket shutdown, but AFAICS, it could still be triggered
> from some external event such as a server-initiated shutdown that
> happened at the same time.
> In fact, looking at the code, it could even be triggered from the
> data
> receive side of things.
> Both of these things are bad, because autoclose puts the transport
> struct that is being freed onto a workqueue. That again can lead to a
> really bad use-after-free situation if the timing is just a little
> different.
>
> So how about the following patch? It should apply cleanly on top of
> the
> first one (which is still needed, btw).
Having thought some more about this, I think the safest thing in order
to avoid races is simply to have the shutdown set XPRT_LOCKED. That way
we can keep the current desirable behaviour of closing the socket
automatically any time the server initiates a close, while still
preventing it during shutdown.
8<-------------------------------------------------------------------
>From 3e1c9d8092e2fa4509d84a00fcf21e7e0c581fe2 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust@primarydata.com>
Date: Fri, 18 Sep 2015 15:53:24 -0400
Subject: [PATCH] SUNRPC: Lock the transport layer on shutdown
Avoid all races with the connect/disconnect handlers by taking the
transport lock.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
net/sunrpc/xprt.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index ab5dd621ae0c..2e98f4a243e5 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -614,6 +614,7 @@ static void xprt_autoclose(struct work_struct *work)
clear_bit(XPRT_CLOSE_WAIT, &xprt->state);
xprt->ops->close(xprt);
xprt_release_write(xprt, NULL);
+ wake_up_bit(&xprt->state, XPRT_LOCKED);
}
/**
@@ -723,6 +724,7 @@ void xprt_unlock_connect(struct rpc_xprt *xprt, void *cookie)
xprt->ops->release_xprt(xprt, NULL);
out:
spin_unlock_bh(&xprt->transport_lock);
+ wake_up_bit(&xprt->state, XPRT_LOCKED);
}
/**
@@ -1394,6 +1396,10 @@ out:
static void xprt_destroy(struct rpc_xprt *xprt)
{
dprintk("RPC: destroying transport %p\n", xprt);
+
+ /* Exclude transport connect/disconnect handlers */
+ wait_on_bit_lock(&xprt->state, XPRT_LOCKED, TASK_UNINTERRUPTIBLE);
+
del_timer_sync(&xprt->timer);
rpc_xprt_debugfs_unregister(xprt);
--
2.4.3
--
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@primarydata.com
next prev parent reply other threads:[~2015-09-18 22:00 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-09-15 15:49 [PATCH] SUNRPC: Fix a race in xs_reset_transport Suzuki K. Poulose
2015-09-15 15:49 ` Suzuki K. Poulose
2015-09-15 18:52 ` Jeff Layton
2015-09-16 8:08 ` Suzuki K. Poulose
2015-09-16 9:04 ` [PATCHv2] " Suzuki K. Poulose
2015-09-16 9:04 ` Suzuki K. Poulose
2015-09-16 9:35 ` Suzuki K. Poulose
2015-09-16 9:35 ` Suzuki K. Poulose
2015-09-16 9:48 ` Marc Zyngier
2015-09-16 11:17 ` Jeff Layton
2015-09-18 11:19 ` Suzuki K. Poulose
2015-09-18 16:51 ` Trond Myklebust
2015-09-18 16:51 ` Trond Myklebust
2015-09-18 22:00 ` Trond Myklebust [this message]
2015-09-18 22:00 ` Trond Myklebust
[not found] ` <20150919080812.063ebf1b@synchrony.poochiereds.net>
2015-09-19 15:07 ` Trond Myklebust
2015-09-21 13:48 ` Suzuki K. Poulose
2015-09-17 13:38 ` [PATCH] " Trond Myklebust
2015-09-17 14:18 ` Jeff Layton
2015-09-17 14:50 ` Trond Myklebust
2015-09-17 14:50 ` Trond Myklebust
2015-09-17 14:59 ` Jeff Layton
2015-09-18 11:16 ` Suzuki K. Poulose
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1442613607.11370.18.camel@primarydata.com \
--to=trond.myklebust@primarydata.com \
--cc=Marc.Zyngier@arm.com \
--cc=Suzuki.Poulose@arm.com \
--cc=anna.schumaker@netapp.com \
--cc=bfields@fieldses.org \
--cc=davem@davemloft.net \
--cc=jlayton@poochiereds.net \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.