From: Avi Kivity <avi@exanet.com>
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Pavel Machek <pavel@ucw.cz>, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] Deadlock during heavy write activity to userspace NFS server on local NFS mount
Date: Thu, 29 Jul 2004 19:09:21 +0300
Message-ID: <41092131.2060200@exanet.com>
In-Reply-To: <4108B558.2050905@yahoo.com.au>
Nick Piggin wrote:
> Avi Kivity wrote:
>
>> Nick Piggin wrote:
>>
>>> Avi Kivity wrote:
>>>
>>>> Nick Piggin wrote:
>>>
>>>>>> What's stopping the NFS server from ooming the machine then?
>>>>>> Every time some bit of memory becomes free, the server will
>>>>>> consume it instantly. Eventually ext3 will not be able to write
>>>>>> anything out because it is out of memory.
>>>>>>
>>>>> The NFS server should do the writeout a page at a time.
>>>>
>>>> The NFS server writes not only in response to page reclaim (as a
>>>> local NFS client), but also in response to pressure from non-local
>>>> clients. If both ext3 and NFS have the same allocation limits, NFS
>>>> may starve out ext3.
>>>>
>>>
>>> What do you mean starve out ext3? ext3 gets written to *by the NFS
>>> server*, which is PF_MEMALLOC.
>>
>> When the NFS server writes, it allocates pagecache and temporary
>> objects. When ext3 writes, it allocates only temporary objects. If the
>> NFS server writes too much, ext3 can't allocate memory now, and it
>> never will be able to, since the server consumes whatever is freed.
>>
>
> That is because your NFS server shouldn't hog as much memory as
> it likes when it is PF_MEMALLOC. The entire writeout path should
> do a page at a time if it is PF_MEMALLOC. I.e., the server should
> be doing write, fsync.
We attempted to use sync local mounts (not quite what you are suggesting:
sync on the NFS client side, without the PF_MEMALLOC hack) and still hit
the same deadlock. I am unable to explain why.
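
For reference, here is a minimal sketch of the page-at-a-time
write-then-fsync loop you describe, as I read it (userspace side of the
server; the function and names below are just illustrative, not our
actual code):

/* Write one page, then push it to disk before touching the next, so
 * the server never holds more than one page of dirty data at a time. */
#include <errno.h>
#include <unistd.h>

static int write_page_at_a_time(int fd, const char *buf, size_t len)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);

	while (len > 0) {
		size_t chunk = len < page ? len : page;
		ssize_t n = write(fd, buf, chunk);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return -1;
		}
		if (fdatasync(fd) < 0)		/* force this page out now */
			return -1;
		buf += n;
		len -= n;
	}
	return 0;
}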
>
> But now that I think about it, I guess you may not be able to
> distinguish that from regular writeout, so doing a page at a time
> would hurt performance too much.
>
> Hmm, so I guess the idea of a per-task reserve limit may be the way
> to do it, yes. Thanks for bearing with me!
It was my pleasure.
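
If it moves the discussion along, here is a rough sketch of the kind of
per-task reserve cap I understand you to mean (everything below is
invented for illustration; no such code exists in the tree):

/* Illustrative only: cap how much of the emergency reserve a single
 * PF_MEMALLOC task may take, so one writer (the NFS server) cannot
 * drain the pool that other writeout paths (ext3) also depend on. */

#define PF_MEMALLOC		0x00000800	/* the real 2.6 flag value */
#define TASK_RESERVE_LIMIT	(32UL * 4096)	/* hypothetical per-task cap */

struct task_reserve {
	unsigned long flags;		/* task flags, e.g. PF_MEMALLOC */
	unsigned long reserve_used;	/* bytes taken from the reserve */
};

/* May this task take 'bytes' more from the emergency pool? */
static int can_use_reserve(struct task_reserve *tsk, unsigned long bytes)
{
	if (!(tsk->flags & PF_MEMALLOC))
		return 0;		/* not a reserve user at all */
	if (tsk->reserve_used + bytes > TASK_RESERVE_LIMIT)
		return 0;		/* would exceed its per-task share */
	tsk->reserve_used += bytes;
	return 1;
}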
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.