* [PATCH] SUNRPC: Fix a race in the receive code path
@ 2017-12-03 18:50 Trond Myklebust
2017-12-03 18:54 ` Chuck Lever
0 siblings, 1 reply; 7+ messages in thread
From: Trond Myklebust @ 2017-12-03 18:50 UTC (permalink / raw)
To: Chuck Lever; +Cc: Anna Schumaker, linux-nfs
We must ensure that the call to rpc_sleep_on() in xprt_transmit() cannot
race with the call to xprt_complete_rqst().
Reported-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=317
Fixes: ce7c252a8c74 ("SUNRPC: Add a separate spinlock to protect..")
Cc: stable@vger.kernel.org # 4.14+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
net/sunrpc/xprt.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 333b9d697ae5..9d9092805696 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -1035,6 +1035,7 @@ void xprt_transmit(struct rpc_task *task)
dprintk("RPC: %5u xmit complete\n", task->tk_pid);
task->tk_flags |= RPC_TASK_SENT;
+ spin_lock(&xprt->recv_lock);
spin_lock_bh(&xprt->transport_lock);
xprt->ops->set_retrans_timeout(task);
@@ -1061,6 +1062,7 @@ void xprt_transmit(struct rpc_task *task)
req->rq_connect_cookie = xprt->connect_cookie;
}
spin_unlock_bh(&xprt->transport_lock);
+ spin_unlock(&xprt->recv_lock);
}
static void xprt_add_backlog(struct rpc_xprt *xprt, struct rpc_task *task)
--
2.14.3
^ permalink raw reply related [flat|nested] 7+ messages in thread* Re: [PATCH] SUNRPC: Fix a race in the receive code path 2017-12-03 18:50 [PATCH] SUNRPC: Fix a race in the receive code path Trond Myklebust @ 2017-12-03 18:54 ` Chuck Lever 2017-12-03 20:12 ` Trond Myklebust 0 siblings, 1 reply; 7+ messages in thread From: Chuck Lever @ 2017-12-03 18:54 UTC (permalink / raw) To: Trond Myklebust; +Cc: Anna Schumaker, Linux NFS Mailing List > On Dec 3, 2017, at 1:50 PM, Trond Myklebust = <trond.myklebust@primarydata.com> wrote: >=20 > We must ensure that the call to rpc_sleep_on() in xprt_transmit() = cannot > race with the call to xprt_complete_rqst(). :-( this will kill scalability, we might as well just go back to the old locking scheme. But I will give it a try. > Reported-by: Chuck Lever <chuck.lever@oracle.com> > Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=3D317 > Fixes: ce7c252a8c74 ("SUNRPC: Add a separate spinlock to protect..") > Cc: stable@vger.kernel.org # 4.14+ > Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> > --- > net/sunrpc/xprt.c | 2 ++ > 1 file changed, 2 insertions(+) >=20 > diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c > index 333b9d697ae5..9d9092805696 100644 > --- a/net/sunrpc/xprt.c > +++ b/net/sunrpc/xprt.c > @@ -1035,6 +1035,7 @@ void xprt_transmit(struct rpc_task *task) >=20 > dprintk("RPC: %5u xmit complete\n", task->tk_pid); > task->tk_flags |=3D RPC_TASK_SENT; > + spin_lock(&xprt->recv_lock); > spin_lock_bh(&xprt->transport_lock); >=20 > xprt->ops->set_retrans_timeout(task); > @@ -1061,6 +1062,7 @@ void xprt_transmit(struct rpc_task *task) > req->rq_connect_cookie =3D xprt->connect_cookie; > } > spin_unlock_bh(&xprt->transport_lock); > + spin_unlock(&xprt->recv_lock); > } >=20 > static void xprt_add_backlog(struct rpc_xprt *xprt, struct rpc_task = *task) > --=20 > 2.14.3 >=20 > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" = in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Chuck Lever chucklever@gmail.com ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] SUNRPC: Fix a race in the receive code path 2017-12-03 18:54 ` Chuck Lever @ 2017-12-03 20:12 ` Trond Myklebust 2017-12-03 20:19 ` Chuck Lever 0 siblings, 1 reply; 7+ messages in thread From: Trond Myklebust @ 2017-12-03 20:12 UTC (permalink / raw) To: chucklever@gmail.com; +Cc: Anna.Schumaker@Netapp.com, linux-nfs@vger.kernel.org T24gU3VuLCAyMDE3LTEyLTAzIGF0IDEzOjU0IC0wNTAwLCBDaHVjayBMZXZlciB3cm90ZToNCj4g PiBPbiBEZWMgMywgMjAxNywgYXQgMTo1MCBQTSwgVHJvbmQgTXlrbGVidXN0IDx0cm9uZC5teWts ZWJ1c3RAcHJpbWFyDQo+ID4geWRhdGEuY29tPiB3cm90ZToNCj4gPiANCj4gPiBXZSBtdXN0IGVu c3VyZSB0aGF0IHRoZSBjYWxsIHRvIHJwY19zbGVlcF9vbigpIGluIHhwcnRfdHJhbnNtaXQoKQ0K PiA+IGNhbm5vdA0KPiA+IHJhY2Ugd2l0aCB0aGUgY2FsbCB0byB4cHJ0X2NvbXBsZXRlX3Jxc3Qo KS4NCj4gDQo+IDotKCB0aGlzIHdpbGwga2lsbCBzY2FsYWJpbGl0eSwgd2UgbWlnaHQgYXMgd2Vs bCBqdXN0IGdvIGJhY2sNCj4gdG8gdGhlIG9sZCBsb2NraW5nIHNjaGVtZS4NCg0KSXQgc2hvdWxk bid0IG1ha2UgYSBodWdlIGRpZmZlcmVuY2UsIGJ1dCBJIGFncmVlIHRoYXQgd2UgZG8gd2FudCB0 byBnZXQNCnJpZCBvZiB0aGF0IHRyYW5zcG9ydCBsb2NrLg0KDQpIb3cgYWJvdXQgdGhlIGZvbGxv d2luZywgdGhlbj8NCg0KODwtLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0NCkZyb20gNmEwYzUwN2YxNjBkNTYyNGQ5MDQ5Mjgx Y2Q5ZGZlMjIyYTg2NmYwNiBNb24gU2VwIDE3IDAwOjAwOjAwIDIwMDENCkZyb206IFRyb25kIE15 a2xlYnVzdCA8dHJvbmQubXlrbGVidXN0QHByaW1hcnlkYXRhLmNvbT4NCkRhdGU6IFN1biwgMyBE ZWMgMjAxNyAxMzozNzoyNyAtMDUwMA0KU3ViamVjdDogW1BBVENIIHYyXSBTVU5SUEM6IEZpeCBh IHJhY2UgaW4gdGhlIHJlY2VpdmUgY29kZSBwYXRoDQoNCldlIG11c3QgZW5zdXJlIHRoYXQgdGhl IGNhbGwgdG8gcnBjX3NsZWVwX29uKCkgaW4geHBydF90cmFuc21pdCgpIGNhbm5vdA0KcmFjZSB3 aXRoIHRoZSBjYWxsIHRvIHhwcnRfY29tcGxldGVfcnFzdCgpLg0KDQpSZXBvcnRlZC1ieTogQ2h1 Y2sgTGV2ZXIgPGNodWNrLmxldmVyQG9yYWNsZS5jb20+DQpMaW5rOiBodHRwczovL2J1Z3ppbGxh LmxpbnV4LW5mcy5vcmcvc2hvd19idWcuY2dpP2lkPTMxNw0KRml4ZXM6IGNlN2MyNTJhOGM3NCAo IlNVTlJQQzogQWRkIGEgc2VwYXJhdGUgc3BpbmxvY2sgdG8gcHJvdGVjdC4uIikNCkNjOiBzdGFi bGVAdmdlci5rZXJuZWwub3JnICMgNC4xNCsNClNpZ25lZC1vZmYtYnk6IFRyb25kIE15a2xlYnVz dCA8dHJvbmQubXlrbGVidXN0QHByaW1hcnlkYXRhLmNvbT4NCi0tLQ0KIG5ldC9zdW5ycGMveHBy dC5jIHwgMTggKysrKysrKysrKystLS0tLS0tDQogMSBmaWxlIGNoYW5nZWQsIDExIGluc2VydGlv bnMoKyksIDcgZGVsZXRpb25zKC0pDQoNCmRpZmYgLS1naXQgYS9uZXQvc3VucnBjL3hwcnQuYyBi L25ldC9zdW5ycGMveHBydC5jDQppbmRleCAzMzNiOWQ2OTdhZTUuLjM0ZjYxMzM4NTMxOSAxMDA2 NDQNCi0tLSBhL25ldC9zdW5ycGMveHBydC5jDQorKysgYi9uZXQvc3VucnBjL3hwcnQuYw0KQEAg LTEwMjQsNiArMTAyNCw3IEBAIHZvaWQgeHBydF90cmFuc21pdChzdHJ1Y3QgcnBjX3Rhc2sgKnRh c2spDQogCX0gZWxzZSBpZiAoIXJlcS0+cnFfYnl0ZXNfc2VudCkNCiAJCXJldHVybjsNCiANCisJ cmVxLT5ycV9jb25uZWN0X2Nvb2tpZSA9IHhwcnQtPmNvbm5lY3RfY29va2llOw0KIAlyZXEtPnJx X3h0aW1lID0ga3RpbWVfZ2V0KCk7DQogCXN0YXR1cyA9IHhwcnQtPm9wcy0+c2VuZF9yZXF1ZXN0 KHRhc2spOw0KIAl0cmFjZV94cHJ0X3RyYW5zbWl0KHhwcnQsIHJlcS0+cnFfeGlkLCBzdGF0dXMp Ow0KQEAgLTEwNDgsMTkgKzEwNDksMjIgQEAgdm9pZCB4cHJ0X3RyYW5zbWl0KHN0cnVjdCBycGNf dGFzayAqdGFzaykNCiAJeHBydC0+c3RhdC5zZW5kaW5nX3UgKz0geHBydC0+c2VuZGluZy5xbGVu Ow0KIAl4cHJ0LT5zdGF0LnBlbmRpbmdfdSArPSB4cHJ0LT5wZW5kaW5nLnFsZW47DQogDQotCS8q IERvbid0IHJhY2Ugd2l0aCBkaXNjb25uZWN0ICovDQotCWlmICgheHBydF9jb25uZWN0ZWQoeHBy dCkpDQotCQl0YXNrLT50a19zdGF0dXMgPSAtRU5PVENPTk47DQotCWVsc2Ugew0KKwlzcGluX3Vu bG9ja19iaCgmeHBydC0+dHJhbnNwb3J0X2xvY2spOw0KKw0KKwlpZiAoIVJFQURfT05DRShyZXEt PnJxX3JlcGx5X2J5dGVzX3JlY3ZkKSAmJiBycGNfcmVwbHlfZXhwZWN0ZWQodGFzaykpIHsNCisJ CXNwaW5fbG9jaygmeHBydC0+cmVjdl9sb2NrKTsNCiAJCS8qDQogCQkgKiBTbGVlcCBvbiB0aGUg cGVuZGluZyBxdWV1ZSBzaW5jZQ0KIAkJICogd2UncmUgZXhwZWN0aW5nIGEgcmVwbHkuDQogCQkg Ki8NCi0JCWlmICghcmVxLT5ycV9yZXBseV9ieXRlc19yZWN2ZCAmJiBycGNfcmVwbHlfZXhwZWN0 ZWQodGFzaykpDQorCQlpZiAoIXJlcS0+cnFfcmVwbHlfYnl0ZXNfcmVjdmQpIHsNCiAJCQlycGNf c2xlZXBfb24oJnhwcnQtPnBlbmRpbmcsIHRhc2ssIHhwcnRfdGltZXIpOw0KLQkJcmVxLT5ycV9j b25uZWN0X2Nvb2tpZSA9IHhwcnQtPmNvbm5lY3RfY29va2llOw0KKwkJCS8qIERlYWwgd2l0aCBk aXNjb25uZWN0IHJhY2VzICovDQorCQkJaWYgKCF4cHJ0X2Nvbm5lY3RlZCh4cHJ0KSkNCisJCQkJ eHBydF93YWtlX3BlbmRpbmdfdGFza3MoeHBydCwgLUVOT1RDT05OKTsNCisJCX0NCisJCXNwaW5f dW5sb2NrKCZ4cHJ0LT5yZWN2X2xvY2spOw0KIAl9DQotCXNwaW5fdW5sb2NrX2JoKCZ4cHJ0LT50 cmFuc3BvcnRfbG9jayk7DQogfQ0KIA0KIHN0YXRpYyB2b2lkIHhwcnRfYWRkX2JhY2tsb2coc3Ry dWN0IHJwY194cHJ0ICp4cHJ0LCBzdHJ1Y3QgcnBjX3Rhc2sgKnRhc2spDQotLSANCjIuMTQuMw0K DQotLSANClRyb25kIE15a2xlYnVzdA0KTGludXggTkZTIGNsaWVudCBtYWludGFpbmVyLCBQcmlt YXJ5RGF0YQ0KdHJvbmQubXlrbGVidXN0QHByaW1hcnlkYXRhLmNvbQ0K ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] SUNRPC: Fix a race in the receive code path 2017-12-03 20:12 ` Trond Myklebust @ 2017-12-03 20:19 ` Chuck Lever 2017-12-03 20:24 ` Trond Myklebust 0 siblings, 1 reply; 7+ messages in thread From: Chuck Lever @ 2017-12-03 20:19 UTC (permalink / raw) To: Trond Myklebust; +Cc: Anna Schumaker, Linux NFS Mailing List > On Dec 3, 2017, at 3:12 PM, Trond Myklebust <trondmy@primarydata.com> = wrote: >=20 > On Sun, 2017-12-03 at 13:54 -0500, Chuck Lever wrote: >>> On Dec 3, 2017, at 1:50 PM, Trond Myklebust <trond.myklebust@primar >>> ydata.com> wrote: >>>=20 >>> We must ensure that the call to rpc_sleep_on() in xprt_transmit() >>> cannot >>> race with the call to xprt_complete_rqst(). >>=20 >> :-( this will kill scalability, we might as well just go back >> to the old locking scheme. >=20 > It shouldn't make a huge difference, but I agree that we do want to = get > rid of that transport lock. >=20 > How about the following, then? This looks better, I'll give it a try! But touching the recv_lock in the transmit path is what I'd like to avoid completely, if possible. I'm not clear on why the pending waitqueue is unprotected. Doesn't it have a lock of its own? > 8<------------------------------------------------------------------ > =46rom 6a0c507f160d5624d9049281cd9dfe222a866f06 Mon Sep 17 00:00:00 = 2001 > From: Trond Myklebust <trond.myklebust@primarydata.com> > Date: Sun, 3 Dec 2017 13:37:27 -0500 > Subject: [PATCH v2] SUNRPC: Fix a race in the receive code path >=20 > We must ensure that the call to rpc_sleep_on() in xprt_transmit() = cannot > race with the call to xprt_complete_rqst(). >=20 > Reported-by: Chuck Lever <chuck.lever@oracle.com> > Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=3D317 > Fixes: ce7c252a8c74 ("SUNRPC: Add a separate spinlock to protect..") > Cc: stable@vger.kernel.org # 4.14+ > Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> > --- > net/sunrpc/xprt.c | 18 +++++++++++------- > 1 file changed, 11 insertions(+), 7 deletions(-) >=20 > diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c > index 333b9d697ae5..34f613385319 100644 > --- a/net/sunrpc/xprt.c > +++ b/net/sunrpc/xprt.c > @@ -1024,6 +1024,7 @@ void xprt_transmit(struct rpc_task *task) > } else if (!req->rq_bytes_sent) > return; >=20 > + req->rq_connect_cookie =3D xprt->connect_cookie; > req->rq_xtime =3D ktime_get(); > status =3D xprt->ops->send_request(task); > trace_xprt_transmit(xprt, req->rq_xid, status); > @@ -1048,19 +1049,22 @@ void xprt_transmit(struct rpc_task *task) > xprt->stat.sending_u +=3D xprt->sending.qlen; > xprt->stat.pending_u +=3D xprt->pending.qlen; >=20 > - /* Don't race with disconnect */ > - if (!xprt_connected(xprt)) > - task->tk_status =3D -ENOTCONN; > - else { > + spin_unlock_bh(&xprt->transport_lock); > + > + if (!READ_ONCE(req->rq_reply_bytes_recvd) && = rpc_reply_expected(task)) { > + spin_lock(&xprt->recv_lock); > /* > * Sleep on the pending queue since > * we're expecting a reply. > */ > - if (!req->rq_reply_bytes_recvd && = rpc_reply_expected(task)) > + if (!req->rq_reply_bytes_recvd) { > rpc_sleep_on(&xprt->pending, task, xprt_timer); > - req->rq_connect_cookie =3D xprt->connect_cookie; > + /* Deal with disconnect races */ > + if (!xprt_connected(xprt)) > + xprt_wake_pending_tasks(xprt, = -ENOTCONN); > + } > + spin_unlock(&xprt->recv_lock); > } > - spin_unlock_bh(&xprt->transport_lock); > } >=20 > static void xprt_add_backlog(struct rpc_xprt *xprt, struct rpc_task = *task) > --=20 > 2.14.3 >=20 > --=20 > Trond Myklebust > Linux NFS client maintainer, PrimaryData > trond.myklebust@primarydata.com > = N=C2=8B=C2=A7=C4=93=C3=A6=C4=97r=C4=BC=C2=9By=C3=BA=C4=8D=C2=9A=C3=98b=C4=93= X=C5=BD=C4=B7=C4=AE=C2=A7v=C3=98^=C2=96)=C3=9E=C5=A1{.n=C4=AE+=C2=89=C2=B7= =C4=A8=C2=8A{=C4=85=C2=9D=C3=BB"=C2=9E=C3=98^n=C2=87r=C4=84=C3=B6=C4=B6z=C3= =8B=1A=C2=81=C3=ABh=C2=99=C4=BB=C4=8D=C2=AD=C3=9A&=C4=92=C3=B8=1E=C5=AAG=C5= =A6=C2=9D=C3=A9h=C5=AA=03(=C2=AD=C3=A9=C2=9A=C2=8E=C2=8A=C3=9D=C4=92j"=C2=9D= =C3=BA=1A=C4=B7=1Bm=C2=A7=C4=B8=C3=AF=C2=81=C4=99=C3=A4z=C4=91=C3=9E=C2=96= =C2=8A=C4=81=C3=BEf=C4=A2=C4=92=C2=B7h=C2=9A=C2=88=C2=A7~=C2=88m -- Chuck Lever chucklever@gmail.com ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] SUNRPC: Fix a race in the receive code path 2017-12-03 20:19 ` Chuck Lever @ 2017-12-03 20:24 ` Trond Myklebust 2017-12-03 23:33 ` Chuck Lever 0 siblings, 1 reply; 7+ messages in thread From: Trond Myklebust @ 2017-12-03 20:24 UTC (permalink / raw) To: chucklever@gmail.com; +Cc: Anna.Schumaker@Netapp.com, linux-nfs@vger.kernel.org T24gU3VuLCAyMDE3LTEyLTAzIGF0IDE1OjE5IC0wNTAwLCBDaHVjayBMZXZlciB3cm90ZToNCj4g PiBPbiBEZWMgMywgMjAxNywgYXQgMzoxMiBQTSwgVHJvbmQgTXlrbGVidXN0IDx0cm9uZG15QHBy aW1hcnlkYXRhLmNvDQo+ID4gbT4gd3JvdGU6DQo+ID4gDQo+ID4gT24gU3VuLCAyMDE3LTEyLTAz IGF0IDEzOjU0IC0wNTAwLCBDaHVjayBMZXZlciB3cm90ZToNCj4gPiA+ID4gT24gRGVjIDMsIDIw MTcsIGF0IDE6NTAgUE0sIFRyb25kIE15a2xlYnVzdCA8dHJvbmQubXlrbGVidXN0QHByDQo+ID4g PiA+IGltYXINCj4gPiA+ID4geWRhdGEuY29tPiB3cm90ZToNCj4gPiA+ID4gDQo+ID4gPiA+IFdl IG11c3QgZW5zdXJlIHRoYXQgdGhlIGNhbGwgdG8gcnBjX3NsZWVwX29uKCkgaW4NCj4gPiA+ID4g eHBydF90cmFuc21pdCgpDQo+ID4gPiA+IGNhbm5vdA0KPiA+ID4gPiByYWNlIHdpdGggdGhlIGNh bGwgdG8geHBydF9jb21wbGV0ZV9ycXN0KCkuDQo+ID4gPiANCj4gPiA+IDotKCB0aGlzIHdpbGwg a2lsbCBzY2FsYWJpbGl0eSwgd2UgbWlnaHQgYXMgd2VsbCBqdXN0IGdvIGJhY2sNCj4gPiA+IHRv IHRoZSBvbGQgbG9ja2luZyBzY2hlbWUuDQo+ID4gDQo+ID4gSXQgc2hvdWxkbid0IG1ha2UgYSBo dWdlIGRpZmZlcmVuY2UsIGJ1dCBJIGFncmVlIHRoYXQgd2UgZG8gd2FudCB0bw0KPiA+IGdldA0K PiA+IHJpZCBvZiB0aGF0IHRyYW5zcG9ydCBsb2NrLg0KPiA+IA0KPiA+IEhvdyBhYm91dCB0aGUg Zm9sbG93aW5nLCB0aGVuPw0KPiANCj4gVGhpcyBsb29rcyBiZXR0ZXIsIEknbGwgZ2l2ZSBpdCBh IHRyeSENCj4gDQo+IEJ1dCB0b3VjaGluZyB0aGUgcmVjdl9sb2NrIGluIHRoZSB0cmFuc21pdCBw YXRoIGlzIHdoYXQgSSdkIGxpa2UNCj4gdG8gYXZvaWQgY29tcGxldGVseSwgaWYgcG9zc2libGUu IEknbSBub3QgY2xlYXIgb24gd2h5IHRoZSBwZW5kaW5nDQo+IHdhaXRxdWV1ZSBpcyB1bnByb3Rl Y3RlZC4gRG9lc24ndCBpdCBoYXZlIGEgbG9jayBvZiBpdHMgb3duPw0KDQpUaGUgcmVjdl9sb2Nr IGVuc3VyZXMgdGhhdCB0aGUgY2hlY2sgb2YgcmVxLT5ycV9yZXBseV9ieXRlc19yZWN2ZCBpcw0K YXRvbWljIHdpdGggdGhlIGNhbGwgdG8gcnBjX3NsZWVwX29uKCkuDQoNCi0tIA0KVHJvbmQgTXlr bGVidXN0DQpMaW51eCBORlMgY2xpZW50IG1haW50YWluZXIsIFByaW1hcnlEYXRhDQp0cm9uZC5t eWtsZWJ1c3RAcHJpbWFyeWRhdGEuY29tDQo= ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] SUNRPC: Fix a race in the receive code path 2017-12-03 20:24 ` Trond Myklebust @ 2017-12-03 23:33 ` Chuck Lever 2017-12-04 0:00 ` Trond Myklebust 0 siblings, 1 reply; 7+ messages in thread From: Chuck Lever @ 2017-12-03 23:33 UTC (permalink / raw) To: Trond Myklebust; +Cc: Anna Schumaker, Linux NFS Mailing List > On Dec 3, 2017, at 3:24 PM, Trond Myklebust <trondmy@primarydata.com> = wrote: >=20 > On Sun, 2017-12-03 at 15:19 -0500, Chuck Lever wrote: >>> On Dec 3, 2017, at 3:12 PM, Trond Myklebust <trondmy@primarydata.co >>> m> wrote: >>>=20 >>> On Sun, 2017-12-03 at 13:54 -0500, Chuck Lever wrote: >>>>> On Dec 3, 2017, at 1:50 PM, Trond Myklebust <trond.myklebust@pr >>>>> imar >>>>> ydata.com> wrote: >>>>>=20 >>>>> We must ensure that the call to rpc_sleep_on() in >>>>> xprt_transmit() >>>>> cannot >>>>> race with the call to xprt_complete_rqst(). >>>>=20 >>>> :-( this will kill scalability, we might as well just go back >>>> to the old locking scheme. >>>=20 >>> It shouldn't make a huge difference, but I agree that we do want to >>> get >>> rid of that transport lock. >>>=20 >>> How about the following, then? >>=20 >> This looks better, I'll give it a try! After testing for several hours, I was not able to reproduce the lost RPC completion symptom. I've been concerned about the performance impact. 8-thread dbench throughput regresses 2-3% on my 56Gb/s network. There will probably not be any noticeable degradation on a slower fabric. Tested-by: Chuck Lever <chuck.lever@oracle.com> >>=20 >> But touching the recv_lock in the transmit path is what I'd like >> to avoid completely, if possible. I'm not clear on why the pending >> waitqueue is unprotected. Doesn't it have a lock of its own? >=20 > The recv_lock ensures that the check of req->rq_reply_bytes_recvd is > atomic with the call to rpc_sleep_on(). Ah, makes sense. I agree it is an oversight in ce7c252a8c7 not to have made this change at the same time. > From: Trond Myklebust <trond.myklebust@primarydata.com> > Date: Sun, 3 Dec 2017 13:37:27 -0500 > Subject: [PATCH v2] SUNRPC: Fix a race in the receive code path >=20 > We must ensure that the call to rpc_sleep_on() in xprt_transmit() = cannot > race with the call to xprt_complete_rqst(). IMHO the patch description could mention rq_reply_bytes_recvd, which is the data that is protected by the recv_lock. More bread crumbs for our future selves... > @@ -1048,19 +1049,22 @@ void xprt_transmit(struct rpc_task *task) > xprt->stat.sending_u +=3D xprt->sending.qlen; > xprt->stat.pending_u +=3D xprt->pending.qlen; >=20 > - /* Don't race with disconnect */ > - if (!xprt_connected(xprt)) > - task->tk_status =3D -ENOTCONN; > - else { > + spin_unlock_bh(&xprt->transport_lock); > + > + if (!READ_ONCE(req->rq_reply_bytes_recvd) && = rpc_reply_expected(task)) { I was wondering about this optimization. It seems to provide about a 1% benefit in dbench throughput results in my setup, though it is certainly not a common case that the reply has arrived at this point. > + spin_lock(&xprt->recv_lock); > /* > * Sleep on the pending queue since > * we're expecting a reply. > */ > - if (!req->rq_reply_bytes_recvd && = rpc_reply_expected(task)) > + if (!req->rq_reply_bytes_recvd) { > rpc_sleep_on(&xprt->pending, task, xprt_timer); > - req->rq_connect_cookie =3D xprt->connect_cookie; > + /* Deal with disconnect races */ > + if (!xprt_connected(xprt)) > + xprt_wake_pending_tasks(xprt, = -ENOTCONN); Here, the !xprt_connected test is no longer done unconditionally, but only when the task has to be put to sleep. Is that a 100% safe change? I'm also wondering if this check can be omitted. Won't disconnect wait until xprt_release_xprt ? Otherwise if this check is still needed, the new comment could be a little more explanatory :-) > + } > + spin_unlock(&xprt->recv_lock); > } > - spin_unlock_bh(&xprt->transport_lock); > } Reviewed-by: Chuck Lever <chuck.lever@oracle.com> And thanks for your quick response! -- Chuck Lever ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] SUNRPC: Fix a race in the receive code path 2017-12-03 23:33 ` Chuck Lever @ 2017-12-04 0:00 ` Trond Myklebust 0 siblings, 0 replies; 7+ messages in thread From: Trond Myklebust @ 2017-12-04 0:00 UTC (permalink / raw) To: chuck.lever@oracle.com Cc: Anna.Schumaker@Netapp.com, linux-nfs@vger.kernel.org T24gU3VuLCAyMDE3LTEyLTAzIGF0IDE4OjMzIC0wNTAwLCBDaHVjayBMZXZlciB3cm90ZToNCj4g PiBPbiBEZWMgMywgMjAxNywgYXQgMzoyNCBQTSwgVHJvbmQgTXlrbGVidXN0IDx0cm9uZG15QHBy aW1hcnlkYXRhLmNvDQo+ID4gbT4gd3JvdGU6DQo+ID4gDQo+ID4gT24gU3VuLCAyMDE3LTEyLTAz IGF0IDE1OjE5IC0wNTAwLCBDaHVjayBMZXZlciB3cm90ZToNCj4gPiA+ID4gT24gRGVjIDMsIDIw MTcsIGF0IDM6MTIgUE0sIFRyb25kIE15a2xlYnVzdCA8dHJvbmRteUBwcmltYXJ5ZGF0DQo+ID4g PiA+IGEuY28NCj4gPiA+ID4gbT4gd3JvdGU6DQo+ID4gPiA+IA0KPiA+ID4gPiBPbiBTdW4sIDIw MTctMTItMDMgYXQgMTM6NTQgLTA1MDAsIENodWNrIExldmVyIHdyb3RlOg0KPiA+ID4gPiA+ID4g T24gRGVjIDMsIDIwMTcsIGF0IDE6NTAgUE0sIFRyb25kIE15a2xlYnVzdCA8dHJvbmQubXlrbGVi dXMNCj4gPiA+ID4gPiA+IHRAcHINCj4gPiA+ID4gPiA+IGltYXINCj4gPiA+ID4gPiA+IHlkYXRh LmNvbT4gd3JvdGU6DQo+ID4gPiA+ID4gPiANCj4gPiA+ID4gPiA+IFdlIG11c3QgZW5zdXJlIHRo YXQgdGhlIGNhbGwgdG8gcnBjX3NsZWVwX29uKCkgaW4NCj4gPiA+ID4gPiA+IHhwcnRfdHJhbnNt aXQoKQ0KPiA+ID4gPiA+ID4gY2Fubm90DQo+ID4gPiA+ID4gPiByYWNlIHdpdGggdGhlIGNhbGwg dG8geHBydF9jb21wbGV0ZV9ycXN0KCkuDQo+ID4gPiA+ID4gDQo+ID4gPiA+ID4gOi0oIHRoaXMg d2lsbCBraWxsIHNjYWxhYmlsaXR5LCB3ZSBtaWdodCBhcyB3ZWxsIGp1c3QgZ28gYmFjaw0KPiA+ ID4gPiA+IHRvIHRoZSBvbGQgbG9ja2luZyBzY2hlbWUuDQo+ID4gPiA+IA0KPiA+ID4gPiBJdCBz aG91bGRuJ3QgbWFrZSBhIGh1Z2UgZGlmZmVyZW5jZSwgYnV0IEkgYWdyZWUgdGhhdCB3ZSBkbw0K PiA+ID4gPiB3YW50IHRvDQo+ID4gPiA+IGdldA0KPiA+ID4gPiByaWQgb2YgdGhhdCB0cmFuc3Bv cnQgbG9jay4NCj4gPiA+ID4gDQo+ID4gPiA+IEhvdyBhYm91dCB0aGUgZm9sbG93aW5nLCB0aGVu Pw0KPiA+ID4gDQo+ID4gPiBUaGlzIGxvb2tzIGJldHRlciwgSSdsbCBnaXZlIGl0IGEgdHJ5IQ0K PiANCj4gQWZ0ZXIgdGVzdGluZyBmb3Igc2V2ZXJhbCBob3VycywgSSB3YXMgbm90IGFibGUgdG8g cmVwcm9kdWNlIHRoZQ0KPiBsb3N0IFJQQyBjb21wbGV0aW9uIHN5bXB0b20uDQo+IA0KPiBJJ3Zl IGJlZW4gY29uY2VybmVkIGFib3V0IHRoZSBwZXJmb3JtYW5jZSBpbXBhY3QuIDgtdGhyZWFkIGRi ZW5jaA0KPiB0aHJvdWdocHV0IHJlZ3Jlc3NlcyAyLTMlIG9uIG15IDU2R2IvcyBuZXR3b3JrLiBU aGVyZSB3aWxsIHByb2JhYmx5DQo+IG5vdCBiZSBhbnkgbm90aWNlYWJsZSBkZWdyYWRhdGlvbiBv biBhIHNsb3dlciBmYWJyaWMuDQo+IA0KPiBUZXN0ZWQtYnk6IENodWNrIExldmVyIDxjaHVjay5s ZXZlckBvcmFjbGUuY29tPg0KPiANCj4gDQo+ID4gPiANCj4gPiA+IEJ1dCB0b3VjaGluZyB0aGUg cmVjdl9sb2NrIGluIHRoZSB0cmFuc21pdCBwYXRoIGlzIHdoYXQgSSdkIGxpa2UNCj4gPiA+IHRv IGF2b2lkIGNvbXBsZXRlbHksIGlmIHBvc3NpYmxlLiBJJ20gbm90IGNsZWFyIG9uIHdoeSB0aGUN Cj4gPiA+IHBlbmRpbmcNCj4gPiA+IHdhaXRxdWV1ZSBpcyB1bnByb3RlY3RlZC4gRG9lc24ndCBp dCBoYXZlIGEgbG9jayBvZiBpdHMgb3duPw0KPiA+IA0KPiA+IFRoZSByZWN2X2xvY2sgZW5zdXJl cyB0aGF0IHRoZSBjaGVjayBvZiByZXEtPnJxX3JlcGx5X2J5dGVzX3JlY3ZkDQo+ID4gaXMNCj4g PiBhdG9taWMgd2l0aCB0aGUgY2FsbCB0byBycGNfc2xlZXBfb24oKS4NCj4gDQo+IEFoLCBtYWtl cyBzZW5zZS4gSSBhZ3JlZSBpdCBpcyBhbiBvdmVyc2lnaHQgaW4gY2U3YzI1MmE4Yzcgbm90IHRv DQo+IGhhdmUgbWFkZSB0aGlzIGNoYW5nZSBhdCB0aGUgc2FtZSB0aW1lLg0KPiANCj4gPiBGcm9t OiBUcm9uZCBNeWtsZWJ1c3QgPHRyb25kLm15a2xlYnVzdEBwcmltYXJ5ZGF0YS5jb20+DQo+ID4g RGF0ZTogU3VuLCAzIERlYyAyMDE3IDEzOjM3OjI3IC0wNTAwDQo+ID4gU3ViamVjdDogW1BBVENI IHYyXSBTVU5SUEM6IEZpeCBhIHJhY2UgaW4gdGhlIHJlY2VpdmUgY29kZSBwYXRoDQo+ID4gDQo+ ID4gV2UgbXVzdCBlbnN1cmUgdGhhdCB0aGUgY2FsbCB0byBycGNfc2xlZXBfb24oKSBpbiB4cHJ0 X3RyYW5zbWl0KCkNCj4gPiBjYW5ub3QNCj4gPiByYWNlIHdpdGggdGhlIGNhbGwgdG8geHBydF9j b21wbGV0ZV9ycXN0KCkuDQo+IA0KPiBJTUhPIHRoZSBwYXRjaCBkZXNjcmlwdGlvbiBjb3VsZCBt ZW50aW9uIHJxX3JlcGx5X2J5dGVzX3JlY3ZkLCB3aGljaA0KPiBpcyB0aGUgZGF0YSB0aGF0IGlz IHByb3RlY3RlZCBieSB0aGUgcmVjdl9sb2NrLiBNb3JlIGJyZWFkIGNydW1icw0KPiBmb3Igb3Vy IGZ1dHVyZSBzZWx2ZXMuLi4NCj4gDQpJJ2xsIHB1dCBhIGNvbW1lbnQgaW4gdGhlIGNvZGUgaXRz ZWxmLiBUaGF0IG1ha2VzIHRoZSBjb2RlIGVhc2llciB0bw0KcmV2aWV3Lg0KDQo+IA0KPiA+IEBA IC0xMDQ4LDE5ICsxMDQ5LDIyIEBAIHZvaWQgeHBydF90cmFuc21pdChzdHJ1Y3QgcnBjX3Rhc2sg KnRhc2spDQo+ID4gCXhwcnQtPnN0YXQuc2VuZGluZ191ICs9IHhwcnQtPnNlbmRpbmcucWxlbjsN Cj4gPiAJeHBydC0+c3RhdC5wZW5kaW5nX3UgKz0geHBydC0+cGVuZGluZy5xbGVuOw0KPiA+IA0K PiA+IC0JLyogRG9uJ3QgcmFjZSB3aXRoIGRpc2Nvbm5lY3QgKi8NCj4gPiAtCWlmICgheHBydF9j b25uZWN0ZWQoeHBydCkpDQo+ID4gLQkJdGFzay0+dGtfc3RhdHVzID0gLUVOT1RDT05OOw0KPiA+ IC0JZWxzZSB7DQo+ID4gKwlzcGluX3VubG9ja19iaCgmeHBydC0+dHJhbnNwb3J0X2xvY2spOw0K PiA+ICsNCj4gPiArCWlmICghUkVBRF9PTkNFKHJlcS0+cnFfcmVwbHlfYnl0ZXNfcmVjdmQpICYm DQo+ID4gcnBjX3JlcGx5X2V4cGVjdGVkKHRhc2spKSB7DQo+IA0KPiBJIHdhcyB3b25kZXJpbmcg YWJvdXQgdGhpcyBvcHRpbWl6YXRpb24uIEl0IHNlZW1zIHRvIHByb3ZpZGUgYWJvdXQgYQ0KPiAx JSBiZW5lZml0IGluIGRiZW5jaCB0aHJvdWdocHV0IHJlc3VsdHMgaW4gbXkgc2V0dXAsIHRob3Vn aCBpdCBpcw0KPiBjZXJ0YWlubHkgbm90IGEgY29tbW9uIGNhc2UgdGhhdCB0aGUgcmVwbHkgaGFz IGFycml2ZWQgYXQgdGhpcyBwb2ludC4NCg0KSXQncyBub3QgdG9vIGxpa2VseSBhbiBvY2N1cnJl bmNlIGZvciBtb3N0IHNldHVwcywgYnV0IGl0IGNvc3RzIGxpdHRsZQ0KdG8gZG8gdGhhdCBjaGVj aywgYW5kIGl0IGRvZXMgbWVhbiB0aGF0IHdlIGNhbiBvcHRpbWlzZSBhd2F5IHRoZSBjYXNlDQpv ZiBSUEMgY2FsbHMgdGhhdCBleHBlY3Qgbm8gcmVwbHkuDQoNCj4gPiArCQlzcGluX2xvY2soJnhw cnQtPnJlY3ZfbG9jayk7DQo+ID4gCQkvKg0KPiA+IAkJICogU2xlZXAgb24gdGhlIHBlbmRpbmcg cXVldWUgc2luY2UNCj4gPiAJCSAqIHdlJ3JlIGV4cGVjdGluZyBhIHJlcGx5Lg0KPiA+IAkJICov DQo+ID4gLQkJaWYgKCFyZXEtPnJxX3JlcGx5X2J5dGVzX3JlY3ZkICYmDQo+ID4gcnBjX3JlcGx5 X2V4cGVjdGVkKHRhc2spKQ0KPiA+ICsJCWlmICghcmVxLT5ycV9yZXBseV9ieXRlc19yZWN2ZCkg ew0KPiA+IAkJCXJwY19zbGVlcF9vbigmeHBydC0+cGVuZGluZywgdGFzaywgeHBydF90aW1lcik7 DQo+ID4gLQkJcmVxLT5ycV9jb25uZWN0X2Nvb2tpZSA9IHhwcnQtPmNvbm5lY3RfY29va2llOw0K PiA+ICsJCQkvKiBEZWFsIHdpdGggZGlzY29ubmVjdCByYWNlcyAqLw0KPiA+ICsJCQlpZiAoIXhw cnRfY29ubmVjdGVkKHhwcnQpKQ0KPiA+ICsJCQkJeHBydF93YWtlX3BlbmRpbmdfdGFza3MoeHBy dCwNCj4gPiAtRU5PVENPTk4pOw0KPiANCj4gSGVyZSwgdGhlICF4cHJ0X2Nvbm5lY3RlZCB0ZXN0 IGlzIG5vIGxvbmdlciBkb25lIHVuY29uZGl0aW9uYWxseSwNCj4gYnV0IG9ubHkgd2hlbiB0aGUg dGFzayBoYXMgdG8gYmUgcHV0IHRvIHNsZWVwLiBJcyB0aGF0IGEgMTAwJSBzYWZlDQo+IGNoYW5n ZT8NCg0KWWVzLiBXZSBvbmx5IGNhcmUgYWJvdXQgdGhlIHJhY2UgY2FzZSBmb3IgdGhpcyBwYXJ0 aWN1bGFyIFJQQyBjYWxsLiBBbnkNCm90aGVyIFJQQ3MgdGhhdCBhcmUgYWxyZWFkeSB3YWl0aW5n IG9uIHhwcnQtPnBlbmRpbmcgd2lsbCBiZSB3b2tlbiBieQ0KdGhlIHRyYW5zcG9ydCBsYXllciBp dHNlbGYuDQoNCj4gSSdtIGFsc28gd29uZGVyaW5nIGlmIHRoaXMgY2hlY2sgY2FuIGJlIG9taXR0 ZWQuIFdvbid0IGRpc2Nvbm5lY3QNCj4gd2FpdCB1bnRpbCB4cHJ0X3JlbGVhc2VfeHBydCA/DQoN CklmIHRoZXJlIGFyZSBubyBvdGhlciBSUEMgY2FsbHMgcGVuZGluZyB0byBkcml2ZSBhIHJlY29u bmVjdCwgdGhlbiB3ZQ0KY2FuIGVuZCB1cCB3YWl0aW5nIGZvciB0aGlzIFJQQyBjYWxsIHRvIHRp bWUgb3V0IGlmIHdlIHJhY2UgaGVyZS4NCg0KPiANCj4gT3RoZXJ3aXNlIGlmIHRoaXMgY2hlY2sg aXMgc3RpbGwgbmVlZGVkLCB0aGUgbmV3IGNvbW1lbnQgY291bGQgYmUNCj4gYSBsaXR0bGUgbW9y ZSBleHBsYW5hdG9yeSA6LSkNCj4gDQo+IA0KPiA+ICsJCX0NCj4gPiArCQlzcGluX3VubG9jaygm eHBydC0+cmVjdl9sb2NrKTsNCj4gPiAJfQ0KPiA+IC0Jc3Bpbl91bmxvY2tfYmgoJnhwcnQtPnRy YW5zcG9ydF9sb2NrKTsNCj4gPiB9DQo+IA0KPiANCj4gUmV2aWV3ZWQtYnk6IENodWNrIExldmVy IDxjaHVjay5sZXZlckBvcmFjbGUuY29tPg0KPiANCj4gQW5kIHRoYW5rcyBmb3IgeW91ciBxdWlj ayByZXNwb25zZSENCj4gDQo+IA0KPiAtLQ0KPiBDaHVjayBMZXZlcg0KPiANCj4gDQo+IA0KLS0g DQpUcm9uZCBNeWtsZWJ1c3QNCkxpbnV4IE5GUyBjbGllbnQgbWFpbnRhaW5lciwgUHJpbWFyeURh dGENCnRyb25kLm15a2xlYnVzdEBwcmltYXJ5ZGF0YS5jb20NCg== ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2017-12-04 0:00 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2017-12-03 18:50 [PATCH] SUNRPC: Fix a race in the receive code path Trond Myklebust 2017-12-03 18:54 ` Chuck Lever 2017-12-03 20:12 ` Trond Myklebust 2017-12-03 20:19 ` Chuck Lever 2017-12-03 20:24 ` Trond Myklebust 2017-12-03 23:33 ` Chuck Lever 2017-12-04 0:00 ` Trond Myklebust
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.