From: "Tuomas Räsänen" <tuomasjjrasanen@opinsys.fi>
To: Jeff Layton <jlayton@poochiereds.net>
Cc: Veli-Matti Lintu <veli-matti.lintu@opinsys.fi>,
linux-nfs@vger.kernel.org
Subject: Re: Soft lockups on kerberised NFSv4.0 clients
Date: Mon, 2 Jun 2014 09:56:45 +0000 (UTC)
Message-ID: <1863565244.57055.1401703005363.JavaMail.zimbra@opinsys.fi>
In-Reply-To: <20140521165304.4331255d@tlielax.poochiereds.net>
----- Original Message -----
> From: "Jeff Layton" <jlayton@poochiereds.net>
>
.
.
.
> Ok, now that I look closer at your stack trace the problem appears to
> be that the unlock code is waiting for the lock context's io_count to
> drop to zero before allowing the unlock to proceed.
>
> That likely means that there is some outstanding I/O that isn't
> completing, but it's possible that the problem is the CB_RECALL is
> being ignored. This will probably require some analysis of wire captures.
>
> In your earlier mail, you mentioned that the client was responding to
> the CB_RECALL with NFS4ERR_BADHANDLE. Determining why that's happening
> may be the best place to focus your efforts.
>
> Now that I look, nfs4_callback_recall does this:
>
>     res = htonl(NFS4ERR_BADHANDLE);
>     inode = nfs_delegation_find_inode(cps->clp, &args->fh);
>     if (inode == NULL)
>         goto out;
>
> So it looks like it's not finding the delegation for some reason.
> You'll probably need to hunt down which open gave you the delegation in
> the first place and then sanity check the CB_RECALL request to
> determine whether it's the client or server that's insane here...
>
Speaking of insanity, I'll try to describe some of our findings in the hope that someone can help us get a better grasp of the issue.
The OPEN requests seem valid to me. There does not seem to be any real difference between OPENs granting RECALLable delegations and OPENs granting delegations which cause BADHANDLEs to be returned when RECALLed. I don't have any idea what to look for; I have probably been staring at capture logs for too long...
BADHANDLE responses to CB_RECALLs seem to be fairly common in our environment, and there is no clear link between those and the soft lockups described earlier by Veli-Matti. BADHANDLEs can happen multiple times before the first soft lockup. After the first soft lockup, the system keeps experiencing lockups (with various tracebacks) at an increasing rate, so I guess only the very first trace is meaningful. And the very first traceback always seems to be the one posted by Veli-Matti in his first email.
The BADHANDLE situation is also quite volatile: if nfs_delegation_find_inode() is called again, a bit later, before returning from nfs4_callback_recall(), it returns a valid inode instead of NULL. What does this indicate? Is it somehow related to the nature of RCU?
--
Tuomas