From: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
To: <linux-nfs@vger.kernel.org>
Subject: NFS invalid refcount warnings
Date: Wed, 22 Mar 2017 15:37:58 +0100
Message-ID: <95b51cbe-438c-e4a6-4a3f-37bb24fbcf2c@imgtec.com>
Hi,
I'm trying to debug an issue I'm seeing on my test machine that occurs
quite reliably, although I'm unfortunately unable to describe any
specific steps to reproduce it.
The system is running kernel 4.10.4.
The rootfs is on an NFS share mounted with the following opts:
<***> on / type nfs
(rw,relatime,vers=3,rsize=4096,wsize=4096,namlen=255,hard,nolock,
proto=udp,timeo=10,retrans=3,sec=sys,mountaddr=<***>,
mountvers=3,mountproto=udp,local_lock=all,addr=<***>)
The system running Linux is an FPGA, so it is relatively slow, and it
performs various stability tests running a lot of applications in
parallel, which makes it particularly slow due to heavy load ;)
It usually takes 30 to 60 minutes for the following error to occur:
warning in nfs_scan_commit_list::kref_get()
[ 3671.685359] [<80453ae4>] nfs_scan_commit_list+0x228/0x248
[ 3671.685359] [<80453ba0>] nfs_scan_commit+0x9c/0x118
[ 3671.685359] [<80453ef8>] nfs_commit_inode+0xf8/0x17c
[ 3671.752838] [<80454300>] nfs_wb_all+0x140/0x278
[ 3671.752838] [<80443390>] nfs_setattr+0x364/0x47c
[ 3671.752838] [<8032ae58>] notify_change+0x1c0/0x4c4
[ 3671.752838] [<80349ab0>] utimes_common+0xc8/0x194
[ 3671.752838] [<80349cd8>] do_utimes+0x15c/0x188
[ 3671.752838] [<80349e9c>] SyS_utimensat+0xa8/0xf8
[ 3671.752838] [<8011a5d8>] syscall_common+0x34/0x58
After the first error, more usually follow, sometimes with the same
call stack, sometimes with a different one, e.g.:
[ 3674.001118] [<80453ae4>] nfs_scan_commit_list+0x228/0x248
[ 3674.001118] [<80453ba0>] nfs_scan_commit+0x9c/0x118
[ 3674.001118] [<80453ef8>] nfs_commit_inode+0xf8/0x17c
[ 3674.001118] [<80454198>] nfs_write_inode+0xa4/0xcc
[ 3674.001118] [<80342da4>] __writeback_single_inode+0x360/0x6e0
[ 3674.001118] [<80343934>] writeback_sb_inodes+0x2b8/0x514
[ 3674.001118] [<80343c50>] __writeback_inodes_wb+0xc0/0x114
[ 3674.001118] [<80343fd4>] wb_writeback+0x330/0x494
[ 3674.001118] [<80344eb0>] wb_workfn+0x2cc/0x77c
[ 3674.001118] [<80179154>] process_one_work+0x20c/0x69c
[ 3674.001118] [<80179760>] worker_thread+0x17c/0x530
[ 3674.001118] [<8018077c>] kthread+0x164/0x194
[ 3674.001118] [<80105dd4>] ret_from_kernel_thread+0x14/0x1c
A few of those warnings are usually followed by linked-list debug
warnings or NULL pointer dereferences in nfs_inode_remove_request
(req->wb_context is NULL).
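For context, here is a minimal user-space sketch of the condition that
the kref_get() warning in the traces above presumably reports: a
reference being taken on an object whose count has already dropped to
zero. This is only a model under that assumption, not the kernel's kref
code or anything from the NFS client; the struct and all model_ref_*
names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical user-space stand-in for a reference-counted object. */
struct model_ref {
	atomic_int count;
};

/* A new object starts with one reference held by its creator. */
static inline void model_ref_init(struct model_ref *r)
{
	atomic_init(&r->count, 1);
}

/*
 * Unconditionally take a reference. If the count was already zero,
 * some other path has dropped the last reference and may be freeing
 * the object; log a warning and return false in that case.
 */
static inline bool model_ref_get(struct model_ref *r)
{
	int old = atomic_fetch_add(&r->count, 1);

	if (old < 1) {
		fprintf(stderr, "warning: get on refcount %d\n", old);
		return false;
	}
	return true;
}

/* Drop a reference; returns true if this was the last one. */
static inline bool model_ref_put(struct model_ref *r)
{
	int old = atomic_fetch_sub(&r->count, 1);

	assert(old > 0);
	return old == 1;
}

/*
 * Take a reference only if the object is still live. Unlike
 * model_ref_get(), this never revives a count that has reached zero.
 */
static inline bool model_ref_get_unless_zero(struct model_ref *r)
{
	int old = atomic_load(&r->count);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
			return true;
	}
	return false;
}
```

In this model, a lookup path that calls model_ref_get() on an object
still reachable from a list after its last model_ref_put() hits the
warning branch, which would be consistent with a request staying on
the commit list after its final reference was dropped.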
I'd appreciate any help with debugging this issue, as I'm struggling to
get a better understanding of what may be happening. This looks like it
might be caused by incorrect locking somewhere, but as I'm not familiar
with the NFS code it's not easy to understand how it works, especially
given its async structure.
thanks,
Marcin