From: "bfields@fieldses.org" <bfields@fieldses.org>
To: Trond Myklebust <trondmy@hammerspace.com>
Cc: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
Chuck Lever <chuck.lever@oracle.com>
Subject: Re: [PATCH] SUNRPC: Don't allow compiler optimisation of svc_xprt_release_slot()
Date: Fri, 11 Jan 2019 16:12:35 -0500 [thread overview]
Message-ID: <20190111211235.GA27206@fieldses.org> (raw)
In-Reply-To: <300445038b75d5efafe9391eb4b8e83d9d6e3633.camel@hammerspace.com>
On Wed, Jan 09, 2019 at 05:41:36PM +0000, Trond Myklebust wrote:
> The above is stating that
>
> smp_rmb();
> smp_read_barrier_depends();
> if (xprt->xpt_flags & ....)
>
> is redundant and can be replaced with just
>
> smp_rmb();
> if (xprt->xpt_flags & ....)
>
> However that's not the case for smp_rmb() followed by READ_ONCE(). That
> would expand to
>
> smp_rmb();
> if (xprt->xpt_flags & ...) {
> smp_read_barrier_depends();
> } else
> smp_read_barrier_depends();
>
> which is not redundant. It is ensuring (on alpha only) that the read of
> xprt->xpt_flags is also not re-ordered w.r.t. other data reads that
> follow.
>
> See, for instance, kernel/events/core.c which has several examples, or
> kernel/exit.c.
You're right, I was confused.
So, I think we need your patch plus something like this.
Chuck, maybe you could help me with the "XXX: Chuck:" parts?
(This applies on top of your patch plus another that just renames the
stupidly long svc_xprt_has_something_to_do() to svc_xprt_read().)
--b.
commit d7356c3250d4
Author: J. Bruce Fields <bfields@redhat.com>
Date: Fri Jan 11 15:36:40 2019 -0500
svcrpc: fix unlikely races preventing queueing of sockets
In the rpc server, When something happens that might be reason to wake
up a thread to do something, what we do is
- modify xpt_flags, sk_sock->flags, xpt_reserved, or
xpt_nr_rqsts to indicate the new situation
- call svc_xprt_enqueue() to decide whether to wake up a thread.
svc_xprt_enqueue may require multiple conditions to be true before
queueing up a thread to handle the xprt. In the SMP case, one of the
other CPU's may have set another required condition, and in that case,
although both CPUs run svc_xprt_enqueue(), it's possible that neither
call sees the writes done by the other CPU in time, and neither one
recognizes that all the required conditions have been set. A socket
could therefore be ignored indefinitely.
Add memory barries to ensure that any svc_xprt_enqueue() call will
always see the conditions changed by other CPUs before deciding to
ignore a socket.
I've never seen this race reported. In the unlikely event it happens,
another event will usually come along and the problem will fix itself.
So I don't think this is worth backporting to stable.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index d410ae512b02..2af21b84b3b6 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -357,6 +357,7 @@ static void svc_xprt_release_slot(struct svc_rqst *rqstp)
struct svc_xprt *xprt = rqstp->rq_xprt;
if (test_and_clear_bit(RQ_DATA, &rqstp->rq_flags)) {
atomic_dec(&xprt->xpt_nr_rqsts);
+ smp_wmb(); /* See smp_rmb() in svc_xprt_ready() */
svc_xprt_enqueue(xprt);
}
}
@@ -365,6 +366,15 @@ static bool svc_xprt_ready(struct svc_xprt *xprt)
{
unsigned long xpt_flags;
+ /*
+ * If another cpu has recently updated xpt_flags,
+ * sk_sock->flags, xpt_reserved, or xpt_nr_rqsts, we need to
+ * know about it; otherwise it's possible that both that cpu and
+ * this one could call svc_xprt_enqueue() without either
+ * svc_xprt_enqueue() recognizing that the conditions below
+ * are satisfied, and we could stall indefinitely:
+ */
+ smp_rmb();
READ_ONCE(xprt->xpt_flags);
if (xpt_flags & (BIT(XPT_CONN) | BIT(XPT_CLOSE)))
@@ -479,7 +489,7 @@ void svc_reserve(struct svc_rqst *rqstp, int space)
if (xprt && space < rqstp->rq_reserved) {
atomic_sub((rqstp->rq_reserved - space), &xprt->xpt_reserved);
rqstp->rq_reserved = space;
-
+ smp_wmb(); /* See smp_rmb() in svc_xprt_ready() */
svc_xprt_enqueue(xprt);
}
}
diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
index 828b149eaaef..377244992ae8 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -316,6 +316,7 @@ static void svc_rdma_wc_receive(struct ib_cq *cq, struct ib_wc *wc)
list_add_tail(&ctxt->rc_list, &rdma->sc_rq_dto_q);
spin_unlock(&rdma->sc_rq_dto_lock);
set_bit(XPT_DATA, &rdma->sc_xprt.xpt_flags);
+ /* XXX: Chuck: do we need an smp_mb__after_atomic() here? */
if (!test_bit(RDMAXPRT_CONN_PENDING, &rdma->sc_flags))
svc_xprt_enqueue(&rdma->sc_xprt);
goto out;
diff --git a/net/sunrpc/xprtrdma/svc_rdma_rw.c b/net/sunrpc/xprtrdma/svc_rdma_rw.c
index dc1951759a8e..e1a790487d69 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_rw.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_rw.c
@@ -290,6 +290,7 @@ static void svc_rdma_wc_read_done(struct ib_cq *cq, struct ib_wc *wc)
spin_unlock(&rdma->sc_rq_dto_lock);
set_bit(XPT_DATA, &rdma->sc_xprt.xpt_flags);
+ /* XXX: Chuck: do we need a smp_mb__after_atomic() here? */
svc_xprt_enqueue(&rdma->sc_xprt);
}
next prev parent reply other threads:[~2019-01-11 21:12 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-01-03 14:17 [PATCH] SUNRPC: Don't allow compiler optimisation of svc_xprt_release_slot() Trond Myklebust
2019-01-03 22:45 ` J Bruce Fields
2019-01-03 23:40 ` Trond Myklebust
2019-01-04 17:39 ` bfields
2019-01-07 21:32 ` bfields
2019-01-07 22:06 ` Trond Myklebust
2019-01-08 15:01 ` bfields
2019-01-08 16:21 ` Trond Myklebust
2019-01-09 16:51 ` bfields
2019-01-09 17:41 ` Trond Myklebust
2019-01-11 21:12 ` bfields [this message]
2019-01-11 21:52 ` Chuck Lever
2019-01-11 21:54 ` Chuck Lever
2019-01-11 22:10 ` Bruce Fields
2019-01-11 22:27 ` Chuck Lever
2019-01-12 0:56 ` Bruce Fields
2019-01-14 17:24 ` Chuck Lever
2019-01-25 20:30 ` Bruce Fields
2019-01-25 21:32 ` Chuck Lever
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190111211235.GA27206@fieldses.org \
--to=bfields@fieldses.org \
--cc=chuck.lever@oracle.com \
--cc=linux-nfs@vger.kernel.org \
--cc=trondmy@hammerspace.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.