* Concerns about SO_REUSEPORT usage in NFS client
@ 2015-12-17 3:48 NeilBrown
2016-02-18 6:15 ` NeilBrown
0 siblings, 1 reply; 2+ messages in thread
From: NeilBrown @ 2015-12-17 3:48 UTC
To: Trond Myklebust; +Cc: linux-nfs, Takashi Iwai
Hi Trond et al.
I have concerns about the new use of SO_REUSEPORT in the NFS client.
Partly this is a theoretical concern. The documentation in socket(7)
talks about using this flag on UDP sockets and on TCP sockets in LISTEN
mode, but not about using it with connected TCP sockets. So the NFS
usage isn't covered by the documentation ... maybe fixing the
documentation would relieve that concern.
But there is also a practical concern: it seems to sometimes cause
failures.
This is reported here:
https://bugzilla.suse.com/show_bug.cgi?id=959216
I cannot reproduce exactly the same symptoms as described there but I
can get close. I:
- establish an NFSv3 mount to a server
- determine the port number used on the client side
- write numbers to /proc/sys/sunrpc/{min,max}_resvport which bracket
that port number in a range of 10 or so
- try to establish NFSv4 mounts in a loop (unmounting each time)
Then the mount will sometimes hang.
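
The bracketing step above can be sketched roughly as follows. This is only an illustration, not the script I actually used: the helper names `bracket_port` and `apply_range` are hypothetical, the width of 10 is the "10 or so" from the steps, and clamping to 1..65535 is simply the valid TCP port range. Writing the sysctl files requires root.

```python
import os


def bracket_port(port, width=10):
    """Return (low, high) spanning `width` ports and containing `port`."""
    if not 1 <= port <= 65535:
        raise ValueError("port out of range: %d" % port)
    low = max(1, port - width // 2)
    high = min(65535, low + width - 1)
    return low, high


def apply_range(low, high, base="/proc/sys/sunrpc"):
    """Write the range to {min,max}_resvport (needs root; hypothetical helper)."""
    for name, value in (("min_resvport", low), ("max_resvport", high)):
        with open(os.path.join(base, name), "w") as f:
            f.write("%d\n" % value)


if __name__ == "__main__":
    # 798 stands in for whatever client port the NFSv3 mount is using.
    print(bracket_port(798))
```

With the range narrowed like this, the NFSv4 mount attempts in the loop have only a handful of reserved ports to pick from, so they are far more likely to land on the port the NFSv3 mount already holds.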
While it is hanging, mount.nfs may be permanently runnable, and
"cat /proc/`pidof mount.nfs`/stack" can show:
[<ffffffff81001012>] ___preempt_schedule+0x12/0x14
[<ffffffffffffffff>] 0xffffffffffffffff
I've also sometimes seen the stack trace mentioned in the bugzilla:
[<ffffffffa030b469>] xprt_connect+0x119/0x170 [sunrpc]
[<ffffffffa0308c06>] call_connect+0x56/0xb0 [sunrpc]
[<ffffffffa0312212>] __rpc_execute+0x82/0x450 [sunrpc]
[<ffffffffa0314fda>] rpc_execute+0x5a/0xb0 [sunrpc]
....
I typically see a 3-minute timeout before the mount fails with:
mount.nfs: Connection timed out
My guess is that SO_REUSEPORT can allow the NFSv4 mount to use the same
connection that the NFSv3 mount is using, though over a different socket.
NFSv4 sends a request; the reply is received by the NFSv3 client's socket,
which rejects it, and the NFSv4 client keeps waiting.
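
As a userspace illustration of the port sharing involved (not of the sunrpc code path itself), the sketch below binds two TCP client sockets to the same local port with SO_REUSEPORT and then tries to connect both to the same listener. It assumes Linux and the loopback interface; the function names are hypothetical. The result of the second connect() is reported rather than assumed, since that is exactly the behaviour in question.

```python
import errno
import socket


def bind_reuseport(port):
    """Create a TCP socket bound to 127.0.0.1:port with SO_REUSEPORT set."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(2.0)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind(("127.0.0.1", port))
    return s


def reuseport_demo():
    """Return (first_port, second_port, outcome of the second connect)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(8)
    first = second = None
    try:
        server = listener.getsockname()
        first = bind_reuseport(0)              # kernel picks a free port
        port = first.getsockname()[1]
        second = bind_reuseport(port)          # same local port, second socket
        first.connect(server)
        try:
            second.connect(server)             # same local and remote endpoints
            outcome = "connected"
        except OSError as exc:
            outcome = errno.errorcode.get(exc.errno, "error")
        return port, second.getsockname()[1], outcome
    finally:
        for s in (first, second, listener):
            if s is not None:
                s.close()


if __name__ == "__main__":
    print(reuseport_demo())
```

The point is only that the bind succeeds for both sockets, so any protection against two clients ending up on one connection has to come from somewhere other than the port allocation.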
I think that we can only continue to use SO_REUSEPORT if we find a way
to ensure that we don't re-use a currently active connection.
NeilBrown
* Re: Concerns about SO_REUSEPORT usage in NFS client
2015-12-17 3:48 Concerns about SO_REUSEPORT usage in NFS client NeilBrown
@ 2016-02-18 6:15 ` NeilBrown
0 siblings, 0 replies; 2+ messages in thread
From: NeilBrown @ 2016-02-18 6:15 UTC
To: Trond Myklebust; +Cc: linux-nfs, Takashi Iwai
Hi Trond (or anyone),
did you get a chance to look at this?
It really seems that the SO_REUSEPORT solution has problems.
Thanks,
NeilBrown