From: Trond Myklebust <Trond.Myklebust@netapp.com>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
    Matthew Wilcox <matthew@wil.cx>
Subject: Re: lost interrupt after a signal?
Date: Thu, 22 May 2008 16:39:43 -0400
Message-ID: <1211488783.8361.8.camel@localhost>
In-Reply-To: <2A43EAAA-8AEC-4EA1-AAA6-1AE1C750DB4C@oracle.com>
On Thu, 2008-05-22 at 10:57 -0400, Chuck Lever wrote:
> We've been running some tests to understand how the 2.6.25
> "intr/nointr" behavior affects signal handling during I/O on NFS mounts.
>
> While running an Oracle database workload, we signal the database
> (this is a normal way administrative tools control database
> activity). Subsequently all of the I/O threads block on the inode
> mutex in nfs_invalidate_mapping() except this one:
>
> INFO: task oracle:27214 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
> message.
> oracle D f6d85e84 1592 27214 1
> c93d2920 00200086 00000001 f6d85e84 c04a0080 c04a0080 c04a0080 c93d2b84
> c93d2b84 c4021f80 00000001 cc072000 f341c900 f6d85e7c 10a1a042 f6d85e7c
> cc072ddc c4021f80 03b7e000 cc072ddc c40082b4 c036e21c cc072dd4 00000001
> Call Trace:
> [<c036e21c>] io_schedule+0x4c/0x90
> [<c015f63c>] sync_page+0x2c/0x40
> [<c036e3e5>] __wait_on_bit_lock+0x45/0x70
> [<c015f610>] sync_page+0x0/0x40
> [<c015f5f3>] __lock_page+0x73/0x80
> [<c013cad0>] wake_bit_function+0x0/0x80
> [<c0167f98>] invalidate_inode_pages2_range+0xb8/0x200
> [<f905d1a8>] nfs_writepages+0x68/0x90 [nfs]
> [<f905489f>] nfs_invalidate_mapping_nolock+0x1f/0xd0 [nfs]
> [<f9054ffa>] nfs_invalidate_mapping+0x5a/0x60 [nfs]
> [<f90538a5>] nfs_file_read+0x85/0x120 [nfs]
> [<c0182685>] do_sync_read+0xd5/0x120
> [<c016cf4a>] __do_fault+0x1ca/0x400
> [<c011c277>] __update_rq_clock+0x27/0x180
> [<c013ca80>] autoremove_wake_function+0x0/0x50
> [<c0136b25>] k_getrusage+0x1f5/0x200
> [<c01e525c>] security_file_permission+0xc/0x10
> [<c0182736>] rw_verify_area+0x66/0xd0
> [<c0136b52>] getrusage+0x22/0x40
> [<c0182f81>] vfs_read+0xa1/0x140
> [<c01825b0>] do_sync_read+0x0/0x120
> [<c01835da>] sys_pread64+0x6a/0x70
> [<c0103e62>] syscall_call+0x7/0xb
>
> I haven't looked too closely at this, but maybe the signal caused a
> lost I/O interrupt?
>
> What would be the next steps to troubleshoot this further?
'cat /proc/27214/status' should tell you if there is a signal that is
being blocked.
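
If reading the hex masks by eye is tedious, something along these lines
could decode them. This is only a rough sketch: it assumes the usual
SigPnd/ShdPnd/SigBlk/SigIgn/SigCgt lines in /proc/<pid>/status, where bit
n-1 of each mask stands for signal n. The helper names (decode_mask,
parse_signal_masks, signal_name) are made up for illustration and are not
part of any existing tool. Pass it the PID of the stuck task (the
hung-task report above names it as oracle:27214).

```python
import signal
import sys

# Signal-mask fields as they appear in /proc/<pid>/status.
MASK_FIELDS = ("SigPnd", "ShdPnd", "SigBlk", "SigIgn", "SigCgt")


def decode_mask(hex_mask):
    """Return the signal numbers set in a hex mask (bit n-1 means signal n)."""
    mask = int(hex_mask, 16)
    return [n for n in range(1, 65) if mask & (1 << (n - 1))]


def parse_signal_masks(status_text):
    """Map each signal-mask field found in the status text to signal numbers."""
    masks = {}
    for line in status_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name in MASK_FIELDS:
            masks[name] = decode_mask(value.strip())
    return masks


def signal_name(number):
    """Best-effort symbolic name; falls back to SIG<n> for unnamed signals."""
    try:
        return signal.Signals(number).name
    except ValueError:
        return "SIG%d" % number


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical script name): python sigmasks.py <pid>
    with open("/proc/%s/status" % sys.argv[1]) as status_file:
        masks = parse_signal_masks(status_file.read())
    for field in MASK_FIELDS:
        names = [signal_name(n) for n in masks.get(field, [])]
        print("%s: %s" % (field, ", ".join(names) or "(none)"))
```

A signal that shows up in both SigBlk and SigPnd (or ShdPnd) is one that
has been sent but is being held off by the task's signal mask.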
--
Trond Myklebust
Linux NFS client maintainer
NetApp
Trond.Myklebust@netapp.com
www.netapp.com