Linux NFS development
From: Christian Herzog <herzog@phys.ethz.ch>
To: Bob Ciotti <bob.ciotti@gmail.com>
Cc: Chuck Lever III <chuck.lever@oracle.com>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
	Bob Ciotti <bob.ciotti@nasa.gov>
Subject: Re: file server freezes with all nfsds stuck in D state after upgrade to Debian bookworm
Date: Thu, 6 Apr 2023 19:26:15 +0200
Message-ID: <ZC8AtxuibSDwvK49@phys.ethz.ch>
In-Reply-To: <4F41FC87-908F-451F-8D2C-089CB7AB5919@gmail.com>

Dear Bob,

thanks a lot for your input.

> >>>> That was our first idea too, but we haven't found any indication that
> >>>> this is the case. The xfs file systems seem perfectly fine when all
> >>>> nfsds are in D state, and we can read from them and write to them. If
> >>>> xfs were to block nfs IO, this should affect other processes too, right?
> >>> It's possible that the NFSD threads are waiting on I/O to a particular filesystem block. XFS is not likely to block other activity in this case.
> >> ok good to know. So far we were under the impression that a file system would
> >> block as a whole.
> > 
> > XFS tries to operate in parallel as much as it can. Maybe other filesystems aren't as capable.
> > 
> > If the unresponsive block is part of a superblock or the journal (ie, shared metadata) I would expect XFS to become unresponsive. For I/O on blocks containing file data, it is likely to have more robust behavior.
> > 
> 
> Pretty sure we have seen a similar issue - never fully explained. From what
> I recall, the server gets to a low memory state. At that point, efforts to
> coalesce writes are abandoned, and each write request is processed in line
> - vs scheduled - and all nfsds then pile up in D. Writes continue to arrive
> at a rate higher than the server can keep up with. But the back end store
> (a high end netapp raid 6 w/240 drives, also with xfs) had very little load
> - not too busy. Never fully explained it - but Chuck's point on a shared
> metadata block may be a good place to look - and whether in-line writes at
> low memory could have synergy. IIRC, we worked around it with releases and
> tunables like minfree kmem et al., which came into play to reduce - but not
> eliminate - the problem. I'm away from reference material for a while but
> I'll review and update if I find anything.

We'll certainly investigate this topic, but right now it's kinda hard to
imagine since I've never seen the file server use more than ~10G of its 64G
of RAM (excluding page cache of course). We're not even sure heavy writes
trigger the problem; in one case our monitoring hinted at a lot of reads
leading up to the freeze.
OTOH if our issue could be resolved by throwing a bunch of RAM modules into
the server, all the better.
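
For reference, a rough sketch of how a "used, excluding page cache" figure
like the ~10G above could be derived from /proc/meminfo. This is only one
common approximation (MemTotal minus MemFree, Buffers, Cached and
SReclaimable), not necessarily what our monitoring computes, and the
function names are made up for illustration:

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> KiB."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def used_excluding_cache_kib(fields):
    """Approximate memory in use, not counting reclaimable cache (KiB).

    Assumption: cache = Buffers + Cached + SReclaimable. Other tools may
    define "used" differently (for example via MemAvailable).
    """
    cache = (
        fields.get("Buffers", 0)
        + fields.get("Cached", 0)
        + fields.get("SReclaimable", 0)
    )
    return fields["MemTotal"] - fields["MemFree"] - cache


def read_used_gib(path="/proc/meminfo"):
    """Read the live value on a Linux host and return it in GiB."""
    with open(path) as f:
        fields = parse_meminfo(f.read())
    return used_excluding_cache_kib(fields) / (1024 * 1024)
```

Logging read_used_gib() periodically on the server would show whether this
value actually climbs towards the 64G in the minutes before a freeze.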


thanks,
-Christian


