* big send queues on NFS server
@ 2013-06-18 13:48 mcr
2013-06-18 17:25 ` J. Bruce Fields
0 siblings, 1 reply; 3+ messages in thread
From: mcr @ 2013-06-18 13:48 UTC (permalink / raw)
To: linux-nfs@vger.kernel.org
Hi, I have been an NFS user and enthusiast for 20+ years.
My home systems still have the numerical uid that doe.carleton.ca
assigned me back in 1989... cause of NFS... Recently, I turned off
a NetBSD 5 machine that was my NFS server, and everything is on a
Linux/Ubuntu server, LVM+raid setup.
I have a slightly interesting setup at my home. A VM (cassidy) with a public IP
address runs a custom web server on port 81 to stream mp3/ogg to
whatever device needs it. My music skips/pauses. Some of this was traced
down to bufferbloat issues when I was listening from work. But, it's
happening at my home desk, connected by Gb/E. An issue with an IPv6 RA
server was ruled out.
To be clear:
desktop(obiwan)---IPv4:81---->server(cassidy)---NFSv4-IPv6-->herring
I am running tmux (like "screen") on the NFS server, with one pane being:
watch 'ss -tan | grep 2049'
And in the other, initially, I was running:
sudo tcpdump -i eth0 -n -p ether host ETHERNETOFCASSIDY
as that was very busy, I ran instead:
sudo tcpdump -i eth0 -n -p ether host 00:16:3e:11:22:e4 and \
'(tcp[13] & 2!=0 or ip6[53]&2 !=0)'
and each time the music stops I see huge xmit queues on the NFS server,
ESTAB 0 789156 2607:dead:f:2::231:2049 2607:dead:f:2:216:3eff:fe11:22e4:868
*usually* that then results in a TCP restart:
09:40:12.701402 IP6 2607:dead:f:2:216:3eff:fe11:22e4.868 >
2607:dead:f:2::231.2049: Flags [S], seq 2570499549, win 5712, options [mss
1440,sackOK,TS val 2994659072 ecr 1552097470,nop,wscale 2], length 0
09:40:12.701456 IP6 2607:dead:f:2::231.2049 >
2607:dead:f:2:216:3eff:fe11:22e4.868: Flags [S.], seq 707413120, ack
2570499550, win 14280, options [mss 1440,sackOK,TS val 1552097470 ecr
2994659072,nop,wscale 7], length 0
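For readers unfamiliar with the byte-offset filter used above: `tcp[13] & 2` tests the SYN bit (0x02) in the TCP flags byte, which sits 13 bytes into the TCP header, and `ip6[53] & 2` tests the same byte for IPv6, because the fixed IPv6 header is 40 bytes long (40 + 13 = 53). A minimal Python sketch of the same check follows. The function name `has_syn` is hypothetical, the input is assumed to be a raw packet starting at the IP header, and IPv6 extension headers are assumed to be absent, as the original filter also assumes.

```python
TCP_SYN = 0x02          # SYN bit in the TCP flags byte
TCP_FLAGS_OFFSET = 13   # flags byte offset within the TCP header
IPV6_HEADER_LEN = 40    # fixed IPv6 header length (no extension headers)
IPPROTO_TCP = 6


def has_syn(packet: bytes) -> bool:
    """Return True if a raw IP packet carries a TCP segment with SYN set.

    `packet` is assumed to start at the IP header (link-layer header removed).
    Unlike the tcpdump expression, this also checks that the payload is TCP.
    """
    if not packet:
        return False
    version = packet[0] >> 4
    if version == 4:
        header_len = (packet[0] & 0x0F) * 4      # IHL is in 32-bit words
        flags_at = header_len + TCP_FLAGS_OFFSET
        if len(packet) <= flags_at or packet[9] != IPPROTO_TCP:
            return False
        return bool(packet[flags_at] & TCP_SYN)
    if version == 6:
        flags_at = IPV6_HEADER_LEN + TCP_FLAGS_OFFSET   # 53, as in ip6[53]
        if len(packet) <= flags_at or packet[6] != IPPROTO_TCP:
            return False
        return bool(packet[flags_at] & TCP_SYN)
    return False
```

Both the `[S]` and `[S.]` packets in the trace above would match, since SYN+ACK (0x12) also has the 0x02 bit set.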
I notice that it always seems to use the same source port number.
I didn't think that this was allowed until after 2*MSL (the TIME_WAIT period).
What seems to me to be occurring is some kind of head-of-queue problem in the
TCP stream. I would be happy to install experimental kernels, instrument
stuff, whatever..., particularly on the NFS client, as it's not a critical
machine. If I need to do something on the NFS server, it will be possible.
I will shortly update the kernel on the client to the one from Debian backports.
I watch and I regularly see large (+1M) send queues on the server:
ESTAB 0 1434080 2607:dead:f:2::231:2049 2607:dead:f:2:216:3eff:fe11:22e4:868
If they decline in time, there is no interruption, otherwise, the web server
gets an underrun, and the music stops.
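The manual `watch 'ss -tan | grep 2049'` observation described above could be automated. What follows is a minimal sketch, not a tested monitoring tool: it parses lines in the `ss -tan` column layout shown in this message (State, Recv-Q, Send-Q, Local, Peer) and reports connections on the NFS port whose send queue exceeds a threshold. The names `parse_ss_line` and `large_send_queues` and the 1,000,000-byte default threshold are assumptions chosen for illustration.

```python
from typing import Iterable, List, NamedTuple, Optional


class SockLine(NamedTuple):
    state: str
    recv_q: int
    send_q: int
    local: str
    peer: str


def parse_ss_line(line: str) -> Optional[SockLine]:
    """Parse one `ss -tan` row; return None for headers or malformed rows."""
    fields = line.split()
    if len(fields) < 5 or not (fields[1].isdigit() and fields[2].isdigit()):
        return None
    return SockLine(fields[0], int(fields[1]), int(fields[2]),
                    fields[3], fields[4])


def large_send_queues(lines: Iterable[str], port: str = "2049",
                      threshold: int = 1_000_000) -> List[SockLine]:
    """Return sockets whose local port is `port` and Send-Q >= threshold."""
    hits = []
    for line in lines:
        sock = parse_ss_line(line)
        if sock is None:
            continue
        # The port is whatever follows the last ':' (also true for IPv6).
        if sock.local.rsplit(":", 1)[-1] == port and sock.send_q >= threshold:
            hits.append(sock)
    return hits
```

On the server, the input lines could come from something like `subprocess.run(["ss", "-tan"], capture_output=True, text=True).stdout.splitlines()`, polled once a second and logged with a timestamp so that the queue build-up can be lined up against the tcpdump trace.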
I could also capture the entire NFS stream, or just do TCP window analysis on
this stream, but I would suspect that it's a problem on the client.
NFS server:
herring-[~] mcr 1001 %uname -a
Linux herring 3.2.0-39-generic #62-Ubuntu SMP Thu Feb 28 00:28:53 UTC 2013
x86_64 x86_64 x86_64 GNU/Linux
NFS client:
cassidy-[~] mcr 1010 %uname -a
Linux cassidy.sandelman.ca 2.6.32-5-xen-686 #1 SMP Wed May 18 09:43:15 UTC
2011 i686 GNU/Linux
* Re: big send queues on NFS server
2013-06-18 13:48 big send queues on NFS server mcr
@ 2013-06-18 17:25 ` J. Bruce Fields
2013-06-18 19:32 ` mcr
0 siblings, 1 reply; 3+ messages in thread
From: J. Bruce Fields @ 2013-06-18 17:25 UTC (permalink / raw)
To: mcr; +Cc: linux-nfs@vger.kernel.org
On Tue, Jun 18, 2013 at 09:48:31AM -0400, mcr@sandelman.ca wrote:
>
> Hi, I have been an NFS user and enthusiast for 20+ years.
> My home systems still have the numerical uid that doe.carleton.ca
> assigned me back in 1989... cause of NFS... Recently, I turned off
> a NetBSD 5 machine that was my NFS server, and everything is on a
> Linux/Ubuntu server, LVM+raid setup.
>
> I have a slightly interesting setup at my home. A VM (cassidy) with a public IP
> address runs a custom web server on port 81 to stream mp3/ogg to
> whatever device needs it. My music skips/pauses. Some of this was traced
> down to bufferbloat issues when I was listening from work. But, it's
> happening at my home desk, connected by Gb/E. An issue with an IPv6 RA
> server was ruled out.
>
> To be clear:
> desktop(obiwan)---IPv4:81---->server(cassidy)---NFSv4-IPv6-->herring
>
> I am running a tmux ("screen") on NFS server, with one pane being:
> watch 'ss -tan | grep 2049'
>
> And in the other, initially, I was running:
> sudo tcpdump -i eth0 -n -p ether host ETHERNETOFCASSIDY
>
> as that was very busy, I ran instead:
> sudo tcpdump -i eth0 -n -p ether host 00:16:3e:11:22:e4 and \
> '(tcp[13] & 2!=0 or ip6[53]&2 !=0)'
>
> and each time the music stops I see huge xmit queues on the NFS server,
>
> ESTAB 0 789156 2607:dead:f:2::231:2049 2607:dead:f:2:216:3eff:fe11:22e4:868
>
> *usually* that then results in a TCP restart:
>
> 09:40:12.701402 IP6 2607:dead:f:2:216:3eff:fe11:22e4.868 >
> 2607:dead:f:2::231.2049: Flags [S], seq 2570499549, win 5712, options [mss
> 1440,sackOK,TS val 2994659072 ecr 1552097470,nop,wscale 2], length 0
>
> 09:40:12.701456 IP6 2607:dead:f:2::231.2049 >
> 2607:dead:f:2:216:3eff:fe11:22e4.868: Flags [S.], seq 707413120, ack
> 2570499550, win 14280, options [mss 1440,sackOK,TS val 1552097470 ecr
> 2994659072,nop,wscale 7], length 0
>
> I notice that it always seems to use the same source port number.
> I didn't think that this was allowed until after 2*MSL (the TIME_WAIT period).
>
> What seems to me to be occurring is some kind of head-of-queue problem in the
> TCP stream. I would be happy to install experimental kernels, instrument
> stuff, whatever..., particularly on the NFS client, as it's not a critical
> machine. If I need to do something on the NFS server, it will be possible.
> I will shortly update the kernel on the client to the one from Debian backports.
>
> I watch and I regularly see large (+1M) send queues on the server:
>
> ESTAB 0 1434080 2607:dead:f:2::231:2049 2607:dead:f:2:216:3eff:fe11:22e4:868
>
> If they decline in time, there is no interruption, otherwise, the web server
> gets an underrun, and the music stops.
>
> I could also capture the entire NFS stream, or just do TCP window analysis on
> this stream, but I would suspect that it's a problem on the client.
Could be, though it sounds like all you changed here was replacing the
NetBSD server by a Linux server?
Of course, that's a rather complicated change in itself (default NFS
version, transport (tcp vs udp), etc. may have changed as well).
Might be worth fooling with those parameters using mount options. The
defaults should be best, but it might help narrow down the problem.
--b.
>
> NFS server:
> herring-[~] mcr 1001 %uname -a
> Linux herring 3.2.0-39-generic #62-Ubuntu SMP Thu Feb 28 00:28:53 UTC 2013
> x86_64 x86_64 x86_64 GNU/Linux
>
> NFS client:
> cassidy-[~] mcr 1010 %uname -a
> Linux cassidy.sandelman.ca 2.6.32-5-xen-686 #1 SMP Wed May 18 09:43:15 UTC
> 2011 i686 GNU/Linux
>
* Re: big send queues on NFS server
2013-06-18 17:25 ` J. Bruce Fields
@ 2013-06-18 19:32 ` mcr
0 siblings, 0 replies; 3+ messages in thread
From: mcr @ 2013-06-18 19:32 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: linux-nfs@vger.kernel.org
On Tue, Jun 18, 2013 at 09:48:31AM -0400, mcr@sandelman.ca wrote:
>> Hi, I have been an NFS user and enthusiast for 20+ years. My home
>> systems still have the numerical uid that doe.carleton.ca assigned me
>> back in 1989... cause of NFS... Recently, I turned off a NetBSD 5
...
>> If they decline in time, there is no interruption, otherwise, the web
>> server gets an underrun, and the music stops.
>>
>> I could also capture the entire NFS stream, or just do TCP window
>> analysis on this stream, but I would suspect that it's a problem on
>> the client.
J. Bruce Fields <bfields@fieldses.org> wrote:
jb> Could be, though it sounds like all you changed here was replacing
jb> the NetBSD server by a Linux server?
Well, I mentioned NetBSD to indicate the length of time I have used various
NFS systems, not because I felt that it was a specific interop issue.
jb> Of course, that's a rather complicated change in itself (default NFS
jb> version, transport (tcp vs udp), etc. may have changed as well).
jb> Might be worth fooling with those parameters using mount options.
jb> The defaults should be best, but it might help narrow down the
jb> problem.
I am using mostly default options: nosuid, nodev, hard.
Generally, I have solved problems in the past by going back to NFSv3 on UDP
mounts, and then doing the classic nfsd worker tuning dance, and the
rsize=/wsize= game.
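To make the narrowing-down approach discussed here systematic, the mount option combinations could be enumerated and tried one at a time. The sketch below only builds the `-o` option strings. The function name `mount_option_matrix` and the candidate sizes (32768 and 8192) are assumptions for illustration, not recommendations; `vers=`, `proto=`, `rsize=` and `wsize=` are standard nfs(5) mount options, and NFSv4 over UDP is skipped because it is not supported.

```python
from itertools import product
from typing import List, Sequence

# Options already in use on the client, per this message.
BASE_OPTIONS = ("nosuid", "nodev", "hard")


def mount_option_matrix(versions: Sequence[int] = (4, 3),
                        protos: Sequence[str] = ("tcp", "udp"),
                        sizes: Sequence[int] = (32768, 8192)) -> List[str]:
    """Build one comma-separated option string per combination to test."""
    combos = []
    for vers, proto, size in product(versions, protos, sizes):
        if vers == 4 and proto == "udp":
            continue  # NFSv4 requires a TCP-like transport
        opts = BASE_OPTIONS + (f"vers={vers}", f"proto={proto}",
                               f"rsize={size}", f"wsize={size}")
        combos.append(",".join(opts))
    return combos
```

Each string would be passed to `mount -t nfs -o <options>` with the real export path, remounting between tests and noting whether the large send queues still appear.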
I am posting in the hope that someone says, "oh, yes, you found issue 34534,
and it's a client-side problem, and it's fixed in 3.7.2..."
or: "that is weird. What does /proc/nfs/magic_client_side_tunnable say?"
or: "I have that too"
or: "can you send a pcap?"
I would love: "that's a client problem" vs "that's a server problem",
and I'd go investigate deeper there :-)
--
] Never tell me the odds! | ipv6 mesh networks [
] Michael Richardson, Sandelman Software Works | network architect [
] mcr@sandelman.ca http://www.sandelman.ca/ | ruby on rails [