From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Avi Kivity <avi@exanet.com>
Cc: Pavel Machek <pavel@ucw.cz>, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] Deadlock during heavy write activity to userspace NFS server on local NFS mount
Date: Wed, 28 Jul 2004 11:29:39 +1000
Message-ID: <41070183.5000701@yahoo.com.au>
In-Reply-To: <4106C2E8.905@exanet.com>

Avi Kivity wrote:
> Pavel Machek wrote:
>
>> I'd hope that kswapd was careful to make sure that it always has
>> enough pages...
>>
>> ...it is harder to do the same auditing with a userland program.
>>
>>
>>
> Very true. But if a kernel thread like kswapd depends on a userspace
> program, then that program better be well behaved.
>
>>> A more complete solution would be to assign memory reserve levels
>>> below which a process starts allocating synchronously. For example,
>>> normal processes must have >20MB to make forward progress, kswapd
>>> wants >15MB and the NFS server needs >10MB. Some way would be needed
>>> to express the dependencies.
>>>
>>
>>
>> Yes, something like that would be necessary. I believe it would be
>> slightly more complicated, like
>>
>> "NFS server needs > 10MB *and working kswapd*", so you'd need 25MB in
>> fact... and this info should be stored in some readable form so that
>> it can be checked.
>>
>>
>>
> If the NFS server needed kswapd, we'd deadlock pretty soon, as kswapd
> *really* needs the NFS server. In our case, all block I/O is done using
> unbuffered I/O, and all memory is preallocated, so we don't need kswapd
> at all, just that small bit of memory that syscalls consume.
>
> If the NFS server really needs kswapd, then there'd better be two of
> them. Regular processes would depend on one kswapd, which depends on the
> NFS server, which depends on the second kswapd, which depends on the
> hardware alone. It should be fun trying to describe that topology to the
> kernel through some API.
>
> Our filesystem actually does something like that internally, except the
> dependency chain length is seven, not two.
>
There is some need arising for a call to set the PF_MEMALLOC flag for
userspace tasks, so you could probably get a patch accepted. Don't
call it KSWAPD_HELPER, though; maybe MEMFREE, RECLAIM, or RECLAIM_HELPER.

But why is your NFS server needed to reclaim memory? Do you have the
filesystem mounted locally?
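
For reference, the reserve-level idea quoted above could be sketched
roughly as follows. This is only an illustration under assumptions: the
enum, the table and must_allocate_synchronously() are made-up names, not
existing kernel interfaces, and the thresholds are just the example
figures from the quoted proposal.

```c
#include <stdbool.h>

/* Hypothetical task classes, one per role named in the proposal. */
enum task_class { TASK_NORMAL, TASK_KSWAPD, TASK_NFS_SERVER, TASK_CLASS_MAX };

/* Reserve level per class, in MB (example figures from the thread). */
static const long reserve_level_mb[TASK_CLASS_MAX] = {
    [TASK_NORMAL]     = 20,
    [TASK_KSWAPD]     = 15,
    [TASK_NFS_SERVER] = 10,
};

/*
 * A task needs more than its reserve level free to make forward
 * progress; at or below that level it has to allocate synchronously,
 * i.e. wait for memory to be freed.
 */
static bool must_allocate_synchronously(enum task_class cls, long free_mb)
{
    return free_mb <= reserve_level_mb[cls];
}
```

With 12MB free, normal processes and kswapd would both have to wait,
while the NFS server could still allocate and so let the others make
progress. Expressing the dependencies between the classes is not
covered by this sketch.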