From: Paul Menzel <pmenzel@molgen.mpg.de>
To: Dave Chinner <david@fromorbit.com>
Cc: it+linux-nfs@molgen.mpg.de, Brian Foster <bfoster@redhat.com>,
Christoph Hellwig <hch@lst.de>,
linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org,
"J. Bruce Fields" <bfields@fieldses.org>,
Jeff Layton <jlayton@poochiereds.net>
Subject: Re: Locking problems with Linux 4.9 and 4.11 with NFSD and `fs/iomap.c`
Date: Thu, 10 Aug 2017 16:11:34 +0200
Message-ID: <92d0933c-f031-f4e7-191e-eb3c9b1260aa@molgen.mpg.de>
In-Reply-To: <20170801225144.GP17762@dastard>
Dear Dave,
On 08/02/17 00:51, Dave Chinner wrote:
> On Tue, Aug 01, 2017 at 07:49:50PM +0200, Paul Menzel wrote:
>> On 06/27/17 13:59, Paul Menzel wrote:
>>
>>> Just a small update that we were hit by the problem on a different
>>> machine (identical model) with Linux 4.9.32 and the exact same
>>> symptoms.
>>>
>>> ```
>>> $ sudo cat /proc/2085/stack
>>> [<ffffffff811f920c>] iomap_write_begin+0x8c/0x120
>>> [<ffffffff811f982b>] iomap_zero_range_actor+0xeb/0x210
>>> [<ffffffff811f9a82>] iomap_apply+0xa2/0x110
>>> [<ffffffff811f9c58>] iomap_zero_range+0x58/0x80
>>> [<ffffffff8133c7de>] xfs_zero_eof+0x4e/0xb0
>>> [<ffffffff8133c9dd>] xfs_file_aio_write_checks+0x19d/0x1c0
>>> [<ffffffff8133ce89>] xfs_file_buffered_aio_write+0x79/0x2d0
>>> [<ffffffff8133d17e>] xfs_file_write_iter+0x9e/0x150
>>> [<ffffffff81198dc0>] do_iter_readv_writev+0xa0/0xf0
>>> [<ffffffff81199fba>] do_readv_writev+0x18a/0x230
>>> [<ffffffff8119a2ac>] vfs_writev+0x3c/0x50
>>> [<ffffffffffffffff>] 0xffffffffffffffff
>>> ```
>>>
>>> We haven’t had time to set up a test system yet to analyze that further.
>>
>> Today, two systems with Linux 4.9.23 exhibited the problem of `top`
>> showing that `nfsd` is at 100 %. Restarting one machine into Linux
>> *4.9.38* showed the same problem. One of them with a 1 GBit/s
>> network device got traffic from a 10 GBit/s system, so the
>> connection was saturated.
>
> So the question is this: is there IO being issued here, is the page
> cache growing, or is it in a tight loop doing nothing? Details of
> your hardware, XFS config and NFS server config is kinda important
> here, too.
Could you please guide me on where I can get the information you request?
The hardware ranges from slow 12-thread systems with 96 GB of RAM to
80-thread machines with 1 TB of RAM. Big files (up to 100 GB) are often written.
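In the meantime, here is roughly what I would collect on one of the affected
servers. This is only a sketch; the export path below is a placeholder, so
please tell me if you need other details:

```
# Hardware overview (CPUs and memory)
lscpu
free -g

# XFS geometry and mount options of the exported filesystem
# (/exported/path is a placeholder for the real export)
xfs_info /exported/path
grep /exported/path /proc/mounts

# NFS server thread count and per-thread busy statistics
cat /proc/fs/nfsd/threads
grep ^th /proc/net/rpc/nfsd
```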
> For example, if the NFS server IO patterns trigger a large
> speculative delayed allocation, then the client does a write at the
> end of the speculative delalloc range, we will zero the entire
> speculative delalloc range. That could be several GB of zeros that
> need to be written here. It's sub-optimal, yes, but large
> zeroing is rare enough that we haven't needed to optimise it by
> allocating unwritten extents instead. It would be really handy to
> know what application the NFS client is running as that might give
> insight into the trigger behaviour and whether you are hitting this
> case.
The applications range from a simple `cp` to scripts writing FASTQ files with
biological sequences in them.
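If it helps, I could also capture the extent layout of one of these big files
while it is being written, to see how large the speculative/delayed allocation
gets. A small sketch, with a made-up file name:

```
# Show the extent map, including unwritten (preallocated) extents;
# delayed-allocation ranges are reported as "delalloc"
# (/exported/path/sample.fastq is a placeholder file)
xfs_bmap -vp /exported/path/sample.fastq
```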
> Also, if the NFS client is only writing to one file, then all the
> other writes that are on the wire will end up being serviced by nfsd
> threads that then block waiting for the inode lock. If the client
> issues more writes on the wire than the NFS server has worker
> threads, the client side write will starve the NFS server of
> worker threads until the zeroing completes. This is the behaviour
> you are seeing - it's a common server side config error that's been
> known for at least 15 years...
>
> FWIW, it used to be that a linux NFS client could have 16 concurrent
> outstanding NFS RPCs to a server at a time - I don't know if that
> limit still exists or whether it's been increased. However, the
> typical knfsd default is (still) only 8 worker threads, meaning a
> single client and server using default configs can cause the above
> server DOS issue. e.g on a bleeding edge debian distro install:
>
> $ head -2 /etc/default/nfs-kernel-server
> # Number of servers to start up
> RPCNFSDCOUNT=8
> $
>
> So, yeah, distros still only configure the nfs server with 8 worker
> threads by default. If it's a dedicated NFS server, then I'd be using
> somewhere around 64 NFSD threads *per CPU* as a starting point for
> the server config...
>
> At minimum, you need to ensure that the NFS server has at least
> double the number of server threads as the largest client side
> concurrent RPC count so that a single client can't DOS the NFS
> server with a single blocked write stream.
That’s not the issue here. The NFS server is already started with 64 threads.
Also, this doesn’t explain why it works with the 4.4 series.
The directory cannot be accessed at all. `ls /mounted/path` just hangs
on remote systems.
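When it happens again, I will dump the kernel stacks of all nfsd threads, to
show whether they are all blocked behind the zeroing write. A sketch of what I
would run:

```
# Dump the kernel stack of every nfsd thread to see where each one blocks
for pid in $(pgrep nfsd); do
    echo "=== nfsd thread $pid ==="
    sudo cat /proc/$pid/stack
done
```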
Kind regards,
Paul
Thread overview: 13+ messages
2017-05-07 19:09 Locking problems with Linux 4.9 with NFSD and `fs/iomap.c` Paul Menzel
2017-05-08 13:18 ` Brian Foster
2017-05-09 9:05 ` Christoph Hellwig
[not found] ` <7ae18b0d-38e3-9b12-0989-ede68956ad43@molgen.mpg.de>
[not found] ` <358037e8-6784-ebca-9fbb-ec7eef3977d6@molgen.mpg.de>
[not found] ` <20170510171757.GA10534@localhost.localdomain>
2017-06-27 11:59 ` Locking problems with Linux 4.9 and 4.11 " Paul Menzel
2017-06-28 16:41 ` Christoph Hellwig
2017-08-01 17:49 ` Paul Menzel
2017-08-01 22:51 ` Dave Chinner
2017-08-10 14:11 ` Paul Menzel [this message]
2017-08-10 19:54 ` AW: " Markus Stockhausen
2017-08-11 10:15 ` Christoph Hellwig
2017-08-11 15:14 ` Paul Menzel
2017-05-10 9:08 ` Locking problems with Linux 4.9 " Paul Menzel
2017-05-10 17:23 ` J. Bruce Fields