* Re: [PATCH 1/2] SUNRPC: accept() may return sockets that are still in SYN_RECV
[not found] ` <7C18520C-D486-4466-8D9D-FF2052B03F0E-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
@ 2016-07-27 18:48 ` Fields Bruce James
[not found] ` <20160727184806.GA19229-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
0 siblings, 1 reply; 4+ messages in thread
From: Fields Bruce James @ 2016-07-27 18:48 UTC (permalink / raw)
To: Trond Myklebust; +Cc: List Linux NFS Mailing, netdev-u79uwXL29TY76Z2rM5mHXA
On Tue, Jul 26, 2016 at 04:08:29PM +0000, Trond Myklebust wrote:
>
> > On Jul 26, 2016, at 11:43, J. Bruce Fields <bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> wrote:
> >
> > On Tue, Jul 26, 2016 at 09:51:19AM -0400, Trond Myklebust wrote:
> >> We're seeing traces of the following form:
> >>
> >> [10952.396347] svc: transport ffff88042ba4a000 dequeued, inuse=2
> >> [10952.396351] svc: tcp_accept ffff88042ba4a000 sock ffff88042a6e4c80
> >> [10952.396362] nfsd: connect from 10.2.6.1, port=187
> >> [10952.396364] svc: svc_setup_socket ffff8800b99bcf00
> >> [10952.396368] setting up TCP socket for reading
> >> [10952.396370] svc: svc_setup_socket created ffff8803eb10a000 (inet ffff88042b75b800)
> >> [10952.396373] svc: transport ffff8803eb10a000 put into queue
> >> [10952.396375] svc: transport ffff88042ba4a000 put into queue
> >> [10952.396377] svc: server ffff8800bb0ec000 waiting for data (to = 3600000)
> >> [10952.396380] svc: transport ffff8803eb10a000 dequeued, inuse=2
> >> [10952.396381] svc_recv: found XPT_CLOSE
> >> [10952.396397] svc: svc_delete_xprt(ffff8803eb10a000)
> >> [10952.396398] svc: svc_tcp_sock_detach(ffff8803eb10a000)
> >> [10952.396399] svc: svc_sock_detach(ffff8803eb10a000)
> >> [10952.396412] svc: svc_sock_free(ffff8803eb10a000)
> >>
> >> i.e. an immediate close of the socket after initialisation.
> >
> > Interesting, thanks!
> >
> > So the one thing I don't understand is why this is correct behavior for
> > accept--I thought it wasn't supposed to return a socket until it was
> > fully established.
>
> inet_accept() appears to allow SYN_RECV:
OK. Cc'ing netdev just to make sure we didn't overlook anything.
(Also: what were user-visible symptoms? Mounts failing, or unexpected
delays?)
--b.
>
> int inet_accept(struct socket *sock, struct socket *newsock, int flags)
> {
>         struct sock *sk1 = sock->sk;
>         int err = -EINVAL;
>         struct sock *sk2 = sk1->sk_prot->accept(sk1, flags, &err);
>
>         if (!sk2)
>                 goto do_err;
>
>         lock_sock(sk2);
>
>         sock_rps_record_flow(sk2);
>         WARN_ON(!((1 << sk2->sk_state) &
>                   (TCPF_ESTABLISHED | TCPF_SYN_RECV |
>                   TCPF_CLOSE_WAIT | TCPF_CLOSE)));
>
>         sock_graft(sk2, newsock);
>
>         newsock->state = SS_CONNECTED;
>         err = 0;
>         release_sock(sk2);
> do_err:
>         return err;
> }
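
For readers following along, the WARN_ON() in the quoted inet_accept() is a bitmask test: each TCP state number selects one bit, and the socket's state is accepted without a warning if its bit is in the mask. Below is a minimal userspace sketch of that check. The EX_ names are made up for this illustration; the numeric values are meant to mirror the kernel's TCP state numbering, and nothing here is kernel code.

```c
#include <stdbool.h>

/* Illustrative copies of the kernel's TCP state numbers.
 * The EX_ prefix marks these names as local to this sketch. */
enum ex_tcp_state {
    EX_TCP_ESTABLISHED = 1,
    EX_TCP_SYN_SENT,
    EX_TCP_SYN_RECV,
    EX_TCP_FIN_WAIT1,
    EX_TCP_FIN_WAIT2,
    EX_TCP_TIME_WAIT,
    EX_TCP_CLOSE,
    EX_TCP_CLOSE_WAIT,
    EX_TCP_LAST_ACK,
    EX_TCP_LISTEN,
    EX_TCP_CLOSING
};

/* One bit per state, in the style of the TCPF_* flags. */
#define EX_TCPF(state) (1u << (state))

/* The four states the quoted WARN_ON() tolerates after accept(). */
#define EX_ACCEPT_MASK (EX_TCPF(EX_TCP_ESTABLISHED) | EX_TCPF(EX_TCP_SYN_RECV) | \
                        EX_TCPF(EX_TCP_CLOSE_WAIT) | EX_TCPF(EX_TCP_CLOSE))

/* Returns true if a freshly accepted socket in this state would pass
 * the check without triggering the warning. */
static bool ex_accept_state_expected(int state)
{
    if (state < 0 || state > 31)
        return false; /* out of range for a 32-bit mask */
    return (EX_TCPF(state) & EX_ACCEPT_MASK) != 0;
}
```

With these definitions, ex_accept_state_expected(EX_TCP_SYN_RECV) is true, which is the point being made above: the check explicitly tolerates a socket that is still in SYN_RECV when accept() returns.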
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: [PATCH 1/2] SUNRPC: accept() may return sockets that are still in SYN_RECV
[not found] ` <20160727184806.GA19229-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
@ 2016-07-27 18:59 ` Eric Dumazet
[not found] ` <1469645951.17736.13.camel-XN9IlZ5yJG9HTL0Zs8A6p+yfmBU6pStAUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>
0 siblings, 1 reply; 4+ messages in thread
From: Eric Dumazet @ 2016-07-27 18:59 UTC (permalink / raw)
To: Fields Bruce James
Cc: Trond Myklebust, List Linux NFS Mailing,
netdev-u79uwXL29TY76Z2rM5mHXA, Yuchung Cheng
On Wed, 2016-07-27 at 14:48 -0400, Fields Bruce James wrote:
> On Tue, Jul 26, 2016 at 04:08:29PM +0000, Trond Myklebust wrote:
> >
> > > On Jul 26, 2016, at 11:43, J. Bruce Fields <bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> wrote:
> > >
> > > On Tue, Jul 26, 2016 at 09:51:19AM -0400, Trond Myklebust wrote:
> > >> We're seeing traces of the following form:
> > >>
> > >> [10952.396347] svc: transport ffff88042ba4a000 dequeued, inuse=2
> > >> [10952.396351] svc: tcp_accept ffff88042ba4a000 sock ffff88042a6e4c80
> > >> [10952.396362] nfsd: connect from 10.2.6.1, port=187
> > >> [10952.396364] svc: svc_setup_socket ffff8800b99bcf00
> > >> [10952.396368] setting up TCP socket for reading
> > >> [10952.396370] svc: svc_setup_socket created ffff8803eb10a000 (inet ffff88042b75b800)
> > >> [10952.396373] svc: transport ffff8803eb10a000 put into queue
> > >> [10952.396375] svc: transport ffff88042ba4a000 put into queue
> > >> [10952.396377] svc: server ffff8800bb0ec000 waiting for data (to = 3600000)
> > >> [10952.396380] svc: transport ffff8803eb10a000 dequeued, inuse=2
> > >> [10952.396381] svc_recv: found XPT_CLOSE
> > >> [10952.396397] svc: svc_delete_xprt(ffff8803eb10a000)
> > >> [10952.396398] svc: svc_tcp_sock_detach(ffff8803eb10a000)
> > >> [10952.396399] svc: svc_sock_detach(ffff8803eb10a000)
> > >> [10952.396412] svc: svc_sock_free(ffff8803eb10a000)
> > >>
> > >> i.e. an immediate close of the socket after initialisation.
> > >
> > > Interesting, thanks!
> > >
> > > So the one thing I don't understand is why this is correct behavior for
> > > accept--I thought it wasn't supposed to return a socket until it was
> > > fully established.
> >
> > inet_accept() appears to allow SYN_RECV:
>
> OK. Cc'ing netdev just to make sure we didn't overlook anything.
>
SYN_RECV after accept() is a TCP Fast Open property I think.
Maybe you are playing with some global TCP Fast Open settings ?
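
One way to test this suggestion from userspace is to ask the kernel for the state of a socket right after accept() returns it. The sketch below assumes Linux and uses getsockopt() with TCP_INFO; the helper name ex_tcp_state is made up for this illustration, and it shows only how the state could be observed, not what knfsd does.

```c
#define _GNU_SOURCE
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>

/* Returns the kernel's TCP state for fd (for example TCP_ESTABLISHED
 * or TCP_SYN_RECV), or -1 if the query fails. Linux-specific. */
static int ex_tcp_state(int fd)
{
    struct tcp_info info;
    socklen_t len = sizeof(info);

    memset(&info, 0, sizeof(info));
    if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &info, &len) != 0)
        return -1;
    return info.tcpi_state;
}
```

A test server could call ex_tcp_state(conn_fd) immediately after accept() and log any result equal to TCP_SYN_RECV. If such results show up only when Fast Open is enabled on the listener (for example through the net.ipv4.tcp_fastopen sysctl or the TCP_FASTOPEN socket option), that would be consistent with the explanation above.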
> (Also: what were user-visible symptoms? Mounts failing, or unexpected
> delays?)
>
> --b.
>
> >
> > int inet_accept(struct socket *sock, struct socket *newsock, int flags)
> > {
> >         struct sock *sk1 = sock->sk;
> >         int err = -EINVAL;
> >         struct sock *sk2 = sk1->sk_prot->accept(sk1, flags, &err);
> >
> >         if (!sk2)
> >                 goto do_err;
> >
> >         lock_sock(sk2);
> >
> >         sock_rps_record_flow(sk2);
> >         WARN_ON(!((1 << sk2->sk_state) &
> >                   (TCPF_ESTABLISHED | TCPF_SYN_RECV |
> >                   TCPF_CLOSE_WAIT | TCPF_CLOSE)));
> >
> >         sock_graft(sk2, newsock);
> >
> >         newsock->state = SS_CONNECTED;
> >         err = 0;
> >         release_sock(sk2);
> > do_err:
> >         return err;
> > }
* Re: [PATCH 1/2] SUNRPC: accept() may return sockets that are still in SYN_RECV
[not found] ` <1469645951.17736.13.camel-XN9IlZ5yJG9HTL0Zs8A6p+yfmBU6pStAUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>
@ 2016-07-27 19:11 ` Trond Myklebust
[not found] ` <332ACE13-83D0-4516-9B2D-200250ED3437-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
0 siblings, 1 reply; 4+ messages in thread
From: Trond Myklebust @ 2016-07-27 19:11 UTC (permalink / raw)
To: Eric Dumazet
Cc: Fields Bruce James, List Linux NFS Mailing,
List Linux Network Devel Mailing, Yuchung Cheng
Hi Eric,
> On Jul 27, 2016, at 14:59, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> On Wed, 2016-07-27 at 14:48 -0400, Fields Bruce James wrote:
>> On Tue, Jul 26, 2016 at 04:08:29PM +0000, Trond Myklebust wrote:
>>>
>>>> On Jul 26, 2016, at 11:43, J. Bruce Fields <bfields@fieldses.org> wrote:
>>>>
>>>> On Tue, Jul 26, 2016 at 09:51:19AM -0400, Trond Myklebust wrote:
>>>>> We're seeing traces of the following form:
>>>>>
>>>>> [10952.396347] svc: transport ffff88042ba4a000 dequeued, inuse=2
>>>>> [10952.396351] svc: tcp_accept ffff88042ba4a000 sock ffff88042a6e4c80
>>>>> [10952.396362] nfsd: connect from 10.2.6.1, port=187
>>>>> [10952.396364] svc: svc_setup_socket ffff8800b99bcf00
>>>>> [10952.396368] setting up TCP socket for reading
>>>>> [10952.396370] svc: svc_setup_socket created ffff8803eb10a000 (inet ffff88042b75b800)
>>>>> [10952.396373] svc: transport ffff8803eb10a000 put into queue
>>>>> [10952.396375] svc: transport ffff88042ba4a000 put into queue
>>>>> [10952.396377] svc: server ffff8800bb0ec000 waiting for data (to = 3600000)
>>>>> [10952.396380] svc: transport ffff8803eb10a000 dequeued, inuse=2
>>>>> [10952.396381] svc_recv: found XPT_CLOSE
>>>>> [10952.396397] svc: svc_delete_xprt(ffff8803eb10a000)
>>>>> [10952.396398] svc: svc_tcp_sock_detach(ffff8803eb10a000)
>>>>> [10952.396399] svc: svc_sock_detach(ffff8803eb10a000)
>>>>> [10952.396412] svc: svc_sock_free(ffff8803eb10a000)
>>>>>
>>>>> i.e. an immediate close of the socket after initialisation.
>>>>
>>>> Interesting, thanks!
>>>>
>>>> So the one thing I don't understand is why this is correct behavior for
>>>> accept--I thought it wasn't supposed to return a socket until it was
>>>> fully established.
>>>
>>> inet_accept() appears to allow SYN_RECV:
>>
>> OK. Cc'ing netdev just to make sure we didn't overlook anything.
>>
>
> SYN_RECV after accept() is a TCP Fast Open property I think.
>
> Maybe you are playing with some global TCP Fast Open settings ?
>
The Linux kernel client should not be using TCP fast open, but it is possible that some of the other NFSv3 clients we’re using are.
Would a standard knfsd listener respond to a TCP fast open request, or would the default behaviour be to ignore it?
If the default behaviour for the server is to allow fast open, then we do need these patches, IMO.
>
>> (Also: what were user-visible symptoms? Mounts failing, or unexpected
>> delays?)
>>
Connection retry storms on the server.
>> --b.
>>
>>>
>>> int inet_accept(struct socket *sock, struct socket *newsock, int flags)
>>> {
>>>         struct sock *sk1 = sock->sk;
>>>         int err = -EINVAL;
>>>         struct sock *sk2 = sk1->sk_prot->accept(sk1, flags, &err);
>>>
>>>         if (!sk2)
>>>                 goto do_err;
>>>
>>>         lock_sock(sk2);
>>>
>>>         sock_rps_record_flow(sk2);
>>>         WARN_ON(!((1 << sk2->sk_state) &
>>>                   (TCPF_ESTABLISHED | TCPF_SYN_RECV |
>>>                   TCPF_CLOSE_WAIT | TCPF_CLOSE)));
>>>
>>>         sock_graft(sk2, newsock);
>>>
>>>         newsock->state = SS_CONNECTED;
>>>         err = 0;
>>>         release_sock(sk2);
>>> do_err:
>>>         return err;
>>> }
>
>
* Re: [PATCH 1/2] SUNRPC: accept() may return sockets that are still in SYN_RECV
[not found] ` <332ACE13-83D0-4516-9B2D-200250ED3437-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
@ 2016-07-28 14:21 ` Fields Bruce James
0 siblings, 0 replies; 4+ messages in thread
From: Fields Bruce James @ 2016-07-28 14:21 UTC (permalink / raw)
To: Trond Myklebust
Cc: Eric Dumazet, List Linux NFS Mailing,
List Linux Network Devel Mailing, Yuchung Cheng
On Wed, Jul 27, 2016 at 07:11:23PM +0000, Trond Myklebust wrote:
> Hi Eric,
>
> > On Jul 27, 2016, at 14:59, Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> >
> > On Wed, 2016-07-27 at 14:48 -0400, Fields Bruce James wrote:
> >> On Tue, Jul 26, 2016 at 04:08:29PM +0000, Trond Myklebust wrote:
> >>>
> >>>> On Jul 26, 2016, at 11:43, J. Bruce Fields <bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> wrote:
> >>>>
> >>>> On Tue, Jul 26, 2016 at 09:51:19AM -0400, Trond Myklebust wrote:
> >>>>> We're seeing traces of the following form:
> >>>>>
> >>>>> [10952.396347] svc: transport ffff88042ba4a000 dequeued, inuse=2
> >>>>> [10952.396351] svc: tcp_accept ffff88042ba4a000 sock ffff88042a6e4c80
> >>>>> [10952.396362] nfsd: connect from 10.2.6.1, port=187
> >>>>> [10952.396364] svc: svc_setup_socket ffff8800b99bcf00
> >>>>> [10952.396368] setting up TCP socket for reading
> >>>>> [10952.396370] svc: svc_setup_socket created ffff8803eb10a000 (inet ffff88042b75b800)
> >>>>> [10952.396373] svc: transport ffff8803eb10a000 put into queue
> >>>>> [10952.396375] svc: transport ffff88042ba4a000 put into queue
> >>>>> [10952.396377] svc: server ffff8800bb0ec000 waiting for data (to = 3600000)
> >>>>> [10952.396380] svc: transport ffff8803eb10a000 dequeued, inuse=2
> >>>>> [10952.396381] svc_recv: found XPT_CLOSE
> >>>>> [10952.396397] svc: svc_delete_xprt(ffff8803eb10a000)
> >>>>> [10952.396398] svc: svc_tcp_sock_detach(ffff8803eb10a000)
> >>>>> [10952.396399] svc: svc_sock_detach(ffff8803eb10a000)
> >>>>> [10952.396412] svc: svc_sock_free(ffff8803eb10a000)
> >>>>>
> >>>>> i.e. an immediate close of the socket after initialisation.
> >>>>
> >>>> Interesting, thanks!
> >>>>
> >>>> So the one thing I don't understand is why this is correct behavior for
> >>>> accept--I thought it wasn't supposed to return a socket until it was
> >>>> fully established.
> >>>
> >>> inet_accept() appears to allow SYN_RECV:
> >>
> >> OK. Cc'ing netdev just to make sure we didn't overlook anything.
> >>
> >
> > SYN_RECV after accept() is a TCP Fast Open property I think.
> >
> > Maybe you are playing with some global TCP Fast Open settings ?
> >
>
> The Linux kernel client should not be using TCP fast open, but it is possible that some of the other NFSv3 clients we’re using are.
> Would a standard knfsd listener respond to a TCP fast open request, or would the default behaviour be to ignore it?
>
> If the default behaviour for the server is to allow fast open, then we do need these patches, IMO.
Even if it's not a default, if there's a configuration that allows
accept to return a socket in SYN_RECV state, then knfsd should handle it
gracefully, especially as long as it's this easy.
It'd still be useful to understand why this is happening, though....
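
To make "handle it gracefully" concrete, here is a purely illustrative userspace model of one possible approach: a state-change hook that does not treat the SYN_RECV to ESTABLISHED transition as a shutdown. This is an assumption for the sake of discussion, not the patch under review (which is not quoted in this thread) and not knfsd code; struct ex_xprt, ex_state_change and the EX_ constants are hypothetical names.

```c
#include <stdbool.h>

/* Illustrative state numbers in the kernel's TCP state order;
 * the EX_ names are local to this sketch. */
enum ex_tcp_state {
    EX_TCP_ESTABLISHED = 1,
    EX_TCP_SYN_SENT,
    EX_TCP_SYN_RECV,
    EX_TCP_FIN_WAIT1,
    EX_TCP_FIN_WAIT2,
    EX_TCP_TIME_WAIT,
    EX_TCP_CLOSE,
    EX_TCP_CLOSE_WAIT
};

/* Hypothetical stand-in for a server-side transport. */
struct ex_xprt {
    bool close_pending; /* models a "please close me" flag */
};

/* Hypothetical hook called whenever the socket changes TCP state. */
static void ex_state_change(struct ex_xprt *xprt, int new_state)
{
    /* A socket accepted while still in SYN_RECV later moves to
     * ESTABLISHED; that completes the handshake and is not a shutdown. */
    if (new_state == EX_TCP_ESTABLISHED)
        return;

    /* Any other transition is treated as the connection going away. */
    xprt->close_pending = true;
}
```

In this model a transport accepted in SYN_RECV survives the later move to ESTABLISHED, while a move to, say, EX_TCP_CLOSE_WAIT still marks it for close. Whether the real server needs exactly this check depends on the root cause, which is still an open question at this point in the thread.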
--b.
Thread overview: 4+ messages
[not found] <1469541080-4184-1-git-send-email-trond.myklebust@primarydata.com>
[not found] ` <20160726154354.GA6692@fieldses.org>
[not found] ` <7C18520C-D486-4466-8D9D-FF2052B03F0E@primarydata.com>
[not found] ` <7C18520C-D486-4466-8D9D-FF2052B03F0E-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
2016-07-27 18:48 ` [PATCH 1/2] SUNRPC: accept() may return sockets that are still in SYN_RECV Fields Bruce James
[not found] ` <20160727184806.GA19229-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2016-07-27 18:59 ` Eric Dumazet
[not found] ` <1469645951.17736.13.camel-XN9IlZ5yJG9HTL0Zs8A6p+yfmBU6pStAUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>
2016-07-27 19:11 ` Trond Myklebust
[not found] ` <332ACE13-83D0-4516-9B2D-200250ED3437-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
2016-07-28 14:21 ` Fields Bruce James