From: Matthew Wilcox <matthew@wil.cx>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: lost interrupt after a signal?
Date: Tue, 27 May 2008 11:35:31 -0600
Message-ID: <20080527173530.GM30894@parisc-linux.org>
In-Reply-To: <D56CE258-FB78-4449-A1D8-376BB3D93387@oracle.com>

On Tue, May 27, 2008 at 11:59:00AM -0400, Chuck Lever wrote:
> >This isn't jumping out screaming that it's my fault (obviously it
> >probably is, but ...).  invalidate_inode_pages2_range calls lock_page()
> >... which uses TASK_UNINTERRUPTIBLE.  If it were calling
> >lock_page_killable(), I'd understand.
> 
> I don't think it's directly caused by your changes, but my concern is  
> that you may have exposed a latent bug, or exposed an underlying  
> design assumption in the NFS/RPC client stack that causes the hang in  
> this situation.

Certainly possible.

> >Maybe this isn't the problem task though.  Maybe this is just the
> >canary that dropped dead, and we should stop trying to autopsy it and
> >start running.  [ok, I'll stop with the bad analogies now]
> 
> This appears to be the only task that is in this state.  All the  
> others in the dump are waiting for this inode's mutex.  I don't know  
> if the dump is complete, though.

My thought is that the task which caused the problem has gone away and
left this page in a state where sync_page will never finish.

> I've passed your suggestions along to our testers.

Thanks!  I'm keen to get this fixed.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
