From: Menyhart Zoltan <Zoltan.Menyhart@bull.net>
To: linux-nfs@vger.kernel.org
Subject: NFS v4 client blocks when using UDP protocol
Date: Tue, 29 Mar 2011 16:32:25 +0200 [thread overview]
Message-ID: <4D91ED79.4020302@bull.net> (raw)
In-Reply-To: <EB0F26AF-C0DB-4861-ABB5-72E39CB38090@bull.net>
Hi,
NFS v4 client blocks under heavy load, when using UDP protocol.
No progress can be seen, the test program "iozone" cannot be killed.
"nfsiostat 10 -p" reports 2 "nfs_writepages()" calls on every 10 seconds:
server:/ mounted on /mnt/test:
op/s rpc bklog
0.00 0.00
read: ops/s kB/s kB/op retrans avg RTT (ms) avg exe (ms)
0.000 0.000 0.000 0 (0.0%) 0.000 0.000
write: ops/s kB/s kB/op retrans avg RTT (ms) avg exe (ms)
0.000 0.000 0.000 0 (0.0%) 0.000 0.000
0 nfs_readpage() calls read 0 pages
0 nfs_readpages() calls read 0 pages
0 nfs_updatepage() calls
0 nfs_writepage() calls wrote 0 pages
2 nfs_writepages() calls wrote 0 pages (0.0 pages per call)
Checking the test program with the "crash" reports:
PID: 6185 TASK: ffff8804047020c0 CPU: 0 COMMAND: "iozone"
#0 [ffff88036ce638b8] schedule at ffffffff814d9459
#1 [ffff88036ce63980] schedule_timeout at ffffffff814da152
#2 [ffff88036ce63a30] io_schedule_timeout at ffffffff814d8f2f
#3 [ffff88036ce63a60] balance_dirty_pages at ffffffff811213d1
#4 [ffff88036ce63b80] balance_dirty_pages_ratelimited_nr at ffffffff811217e4
#5 [ffff88036ce63b90] generic_file_buffered_write at ffffffff8110d993
#6 [ffff88036ce63c60] __generic_file_aio_write at ffffffff8110f230
#7 [ffff88036ce63d20] generic_file_aio_write at ffffffff8110f4cf
#8 [ffff88036ce63d70] nfs_file_write at ffffffffa05e4c2e
#9 [ffff88036ce63dc0] do_sync_write at ffffffff81170bda
#10 [ffff88036ce63ef0] vfs_write at ffffffff81170ed8
#11 [ffff88036ce63f30] sys_write at ffffffff81171911
#12 [ffff88036ce63f80] system_call_fastpath at ffffffff8100b172
RIP: 0000003709a0e490 RSP: 00007fffba8eac78 RFLAGS: 00000246
RAX: 0000000000000001 RBX: ffffffff8100b172 RCX: 0000003709a0ec10
RDX: 0000000000400000 RSI: 00007fe469600000 RDI: 0000000000000003
RBP: 0000000000000298 R8: 0000000000000000 R9: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fe469600000
R13: 0000000000eb4c20 R14: 0000000000400000 R15: 0000000000eb4c20
ORIG_RAX: 0000000000000001 CS: 0033 SS: 002b
No activity can be seen with wireshark on the network.
Rebooting the client and restarting the test can show another failing point:
PID: 4723 TASK: ffff8803fc235580 CPU: 2 COMMAND: "iozone"
#0 [ffff880417e2dc08] schedule at ffffffff814d9459
#1 [ffff880417e2dcd0] io_schedule at ffffffff814d9c43
#2 [ffff880417e2dcf0] sync_page at ffffffff8110d01d
#3 [ffff880417e2dd00] __wait_on_bit at ffffffff814da4af
#4 [ffff880417e2dd50] wait_on_page_bit at ffffffff8110d1d3
#5 [ffff880417e2ddb0] wait_on_page_writeback_range at ffffffff8110d5eb
#6 [ffff880417e2deb0] filemap_write_and_wait_range at ffffffff8110d7b8
#7 [ffff880417e2dee0] vfs_fsync_range at ffffffff8119f11e
#8 [ffff880417e2df30] vfs_fsync at ffffffff8119f1ed
#9 [ffff880417e2df40] do_fsync at ffffffff8119f22e
#10 [ffff880417e2df70] sys_fsync at ffffffff8119f280
#11 [ffff880417e2df80] system_call_fastpath at ffffffff8100b172
RIP: 0000003709a0ebb0 RSP: 00007fff5517dc28 RFLAGS: 00000212
RAX: 000000000000004a RBX: ffffffff8100b172 RCX: 00000000000000ef
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003
RBP: 0000000000008000 R8: 00000000ffffffff R9: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: ffffffff8119f280
R13: ffff880417e2df78 R14: 00000000011d7c30 R15: 0000000000000000
ORIG_RAX: 000000000000004a CS: 0033 SS: 002b
Sometimes console messages appear, like:
INFO: task iozone:4723 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
iozone D 0000000000000002 0 4723 4683 0x00000080
ffff880417e2dcc8 0000000000000086 ffff880417e2dc48 ffffffffa042fc99
ffff88036e53d440 ffff8803e386aac0 ffff880402cf4610 ffff88036e53d448
ffff8803fc235b38 ffff880417e2dfd8 000000000000f558 ffff8803fc235b38
Call Trace:
[<ffffffffa042fc99>] ? rpc_run_task+0xd9/0x130 [sunrpc]
[<ffffffff81098aa9>] ? ktime_get_ts+0xa9/0xe0
[<ffffffff8110cfe0>] ? sync_page+0x0/0x50
[<ffffffff814d9c43>] io_schedule+0x73/0xc0
[<ffffffff8110d01d>] sync_page+0x3d/0x50
[<ffffffff814da4af>] __wait_on_bit+0x5f/0x90
[<ffffffff8110d1d3>] wait_on_page_bit+0x73/0x80
[<ffffffff8108df40>] ? wake_bit_function+0x0/0x50
[<ffffffff81122f25>] ? pagevec_lookup_tag+0x25/0x40
[<ffffffff8110d5eb>] wait_on_page_writeback_range+0xfb/0x190
[<ffffffff8110d7b8>] filemap_write_and_wait_range+0x78/0x90
[<ffffffff8119f11e>] vfs_fsync_range+0x7e/0xe0
[<ffffffff8119f1ed>] vfs_fsync+0x1d/0x20
[<ffffffff8119f22e>] do_fsync+0x3e/0x60
[<ffffffff8119f280>] sys_fsync+0x10/0x20
[<ffffffff8100b172>] system_call_fastpath+0x16/0x1b
This problem can be systematically reproduced with the following test:
iozone -a -c -e -E -L 64 -m -S 4096 -f /mnt/test/file -g 20G -n 1M -y 4K
It can be reproduced with both type of networks we have, see below.
Switching on RPC debugging with "sysctl -w sunrpc.rpc_debug=65535" gives:
-pid- flgs status -client- --rqstp- -timeout ---ops--
28139 0001 -11 ffff88040f403400 (null) 0 ffffffffa06157e0 nfsv4 WRITE a:call_reserveresult q:xprt_backlog
29167 0001 -11 ffff88040f403400 ffff8803ff405130 0 ffffffffa06157e0 nfsv4 WRITE a:call_status q:xprt_resend
29172 0001 -11 ffff88040f403400 (null) 0 ffffffffa06157e0 nfsv4 WRITE a:call_reserveresult q:xprt_backlog
...
51364 0001 -11 ffff88040f403400 (null) 0 ffffffffa06157e0 nfsv4 WRITE a:call_reserveresult q:xprt_backlog
55355 0001 -11 ffff88040f403400 ffff8803ff405450 0 ffffffffa06157e0 nfsv4 WRITE a:call_status q:xprt_resend
55356 0001 -11 ffff88040f403400 ffff8803ff404000 0 ffffffffa06157e0 nfsv4 WRITE a:call_status q:xprt_resend
55357 0001 -11 ffff88040f403400 ffff8803ff404c80 0 ffffffffa06157e0 nfsv4 WRITE a:call_status q:xprt_resend
55369 0001 -11 ffff88040f403400 ffff8803ff4055e0 0 ffffffffa06157e0 nfsv4 WRITE a:call_status q:xprt_resend
...
55370 0001 -11 ffff88040f403400 (null) 0 ffffffffa06157e0 nfsv4 WRITE a:call_reserveresult q:xprt_backlog
55371 0001 -11 ffff88040f403400 (null) 0 ffffffffa06157e0 nfsv4 WRITE a:call_reserveresult q:xprt_backlog
...
No error messages in "dmesg" neither in "/var/log/messages" at the server side.
Our test HW:
2 * Supermicro R460, 2 dual core Xeons 5150 2.66GHz in each machine
InfiniBand Mellanox Technologies MT26418 in DDR mode, IP over IB
Intel Corporation 80003ES2LAN Gigabit Ethernet Controller, point to point direct connection
SW:
Kernels up to 2.6.38-rc8
Server side:
A SW RAID "/dev/md0" mounted on "/md0" is exported:
/dev/md0:
Version : 0.90
Creation Time : Wed Feb 18 17:18:34 2009
Raid Level : raid0
Array Size : 357423360 (340.87 GiB 366.00 GB)
Raid Devices : 5
Total Devices : 5
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Wed Feb 18 17:18:34 2009
State : clean
Active Devices : 5
Working Devices : 5
Failed Devices : 0
Spare Devices : 0
Chunk Size : 256K
UUID : 05fae940:c7540b1e:d0f324d8:8f753e72
Events : 0.1
Number Major Minor RaidDevice State
0 8 16 0 active sync /dev/sdb
1 8 32 1 active sync /dev/sdc
2 8 48 2 active sync /dev/sdd
3 8 64 3 active sync /dev/sde
4 8 80 4 active sync /dev/sdf
/etc/exports:
/md0 *(rw,insecure,no_subtree_check,sync,fsid=0)
The client machine mounts it as follows:
mount -o async,udp 192.168.1.79:/ /mnt/test # Ethernet
or
mount -o async,udp 192.168.42.79:/ /mnt/test # IP over IB
Have you already seen this problem?
Thank you in advance,
Zoltan Menyhart
next parent reply other threads:[~2011-03-29 15:02 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <EB0F26AF-C0DB-4861-ABB5-72E39CB38090@bull.net>
2011-03-29 14:32 ` Menyhart Zoltan [this message]
2011-03-29 15:06 ` NFS v4 client blocks when using UDP protocol Chuck Lever
2011-03-29 15:26 ` Menyhart Zoltan
2011-03-29 15:46 ` Chuck Lever
2011-03-29 15:50 ` peter.staubach
2011-03-30 0:12 ` Tom Haynes
2011-03-30 9:00 ` Trond Myklebust
2011-11-10 8:11 fanchaoting
2011-11-10 14:50 ` Myklebust, Trond
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4D91ED79.4020302@bull.net \
--to=zoltan.menyhart@bull.net \
--cc=linux-nfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.