From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Avi Kivity <avi@exanet.com>
Cc: Pavel Machek <pavel@ucw.cz>, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] Deadlock during heavy write activity to userspace NFS server on local NFS mount
Date: Wed, 28 Jul 2004 19:05:48 +1000
Message-ID: <41076C6C.2010401@yahoo.com.au>
In-Reply-To: <41075986.8020401@exanet.com>

Avi Kivity wrote:
> Nick Piggin wrote:
> 
>> Avi Kivity wrote:
>>
>>> Nick Piggin wrote:
>>
>>>> The solution is that PF_MEMALLOC tasks are allowed to access the
>>>> reserve pool. Dependencies don't matter to this system. It would be
>>>> your job to ensure all tasks that might need to allocate memory in
>>>> order to free memory have the flag set.
>>>
>>> In the general case that's not sufficient. What if the NFS server 
>>> wrote to ext3 via the VFS? We might have a ton of ext3 pagecache 
>>> waiting for kswapd to reclaim NFS memory, while kswapd is waiting on 
>>> the NFS server writing to ext3.
>>>
>>
>> It is sufficient.
>>
>> You didn't explain your example very well, but I'll assume it is the
>> following:
>>
>> dirty NFS data -> NFS server on localhost -> ext3 filesystem. 
> 
> 
> That's what I meant, sorry for not making it clear.
> 
>>
>>
>> So kswapd tries to reclaim some memory and writes out the dirty NFS
>> data. The NFS server then writes this data to ext3 (it can do this
>> because it is PF_MEMALLOC). The data gets written out, the NFS server
>> tells the client it is clean, kswapd continues.
>>
>> Right?
> 
> 
> What's stopping the NFS server from ooming the machine then? Every time 
> some bit of memory becomes free, the server will consume it instantly. 
> Eventually ext3 will not be able to write anything out because it is out 
> of memory.
> 

The NFS server should do the writeout a page at a time.

> An even more complex case is when ext3 depends on some other process, 
> say it is mounted on a loopback nbd.
> 
>  dirty NFS data -> NFS server -> ext3 -> nbd -> nbd server on localhost 
> -> ext3/raw device
> 
> You can't have both the NFS server and the nbd server PF_MEMALLOC, since 
> the NFS server may consume all memory, then wait for the nbd server to 
> reclaim.
> 

The memory allocators will block when memory reaches the reserved
mark. Page reclaim will ask NFS to free one page, so the server
will write something out to the filesystem. This in turn causes the
nbd server (also PF_MEMALLOC) to write out to its backing filesystem.
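
To make the rule above concrete, here is a minimal user-space sketch of
the gate being described. It is a toy model, not kernel code: `struct
task`, `struct pool`, `TASK_MEMALLOC` and `try_alloc_page()` are
hypothetical names invented for illustration, and the real allocator has
several watermarks rather than a single reserve mark.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's PF_MEMALLOC task flag. */
#define TASK_MEMALLOC 0x1u

/* Toy task: only the flag word matters for this model. */
struct task {
    unsigned int flags;
};

/* Toy memory pool with a single reserve mark (an assumption). */
struct pool {
    long free_pages;    /* pages currently free */
    long reserve_pages; /* ordinary tasks may not dip below this */
};

/*
 * Try to take one page from the pool.
 * Returns true on success, false if the caller would have to block
 * and wait for reclaim. Tasks with TASK_MEMALLOC set may consume the
 * reserve; all other tasks stop at the reserve mark.
 */
bool try_alloc_page(struct pool *p, const struct task *t)
{
    bool may_use_reserve = (t->flags & TASK_MEMALLOC) != 0;
    long floor = may_use_reserve ? 0 : p->reserve_pages;

    if (p->free_pages <= floor)
        return false;
    p->free_pages--;
    return true;
}
```

In this model, once free memory is down at the reserve mark, an ordinary
writer gets `false` and has to wait, while a flagged server in the
writeout chain can still take the pages it needs to push data out.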

> The solution I have in mind is to replace the sync allocation logic from
> 
>    if (free_mem() < some_global_limit && !(current->flags & PF_MEMALLOC))
>        wait_for_kswapd()
> 
> to
> 
>    if (free_mem() < current->limit)
>        wait_for_kswapd()
> 
> kswapd would have the lowest ->limit, other processes as their place in 
> the food chain dictates. 

I think this is barking up the wrong tree. It really doesn't matter
what process is freeing memory. There isn't really anything special
about the way kswapd frees memory.
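
For reference, the two gating policies contrasted in the quoted proposal
can be written as a pair of predicates. This is only a sketch of the
pseudocode as quoted: `struct model_task`, its `limit` field,
`MODEL_PF_MEMALLOC` and both function names are assumptions made for
illustration, not existing kernel interfaces.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's PF_MEMALLOC task flag. */
#define MODEL_PF_MEMALLOC 0x1u

/* Toy task carrying both the flag and the proposed per-task limit. */
struct model_task {
    unsigned int flags;
    long limit; /* hypothetical per-task watermark from the proposal */
};

/*
 * Flag-based policy: everyone waits below one global limit,
 * except tasks that carry the MEMALLOC flag.
 */
bool must_wait_global(long free_pages, long global_limit,
                      const struct model_task *t)
{
    return free_pages < global_limit &&
           !(t->flags & MODEL_PF_MEMALLOC);
}

/*
 * Proposed policy: each task waits below its own limit, so tasks
 * lower in the chain can be given lower limits.
 */
bool must_wait_per_task(long free_pages, const struct model_task *t)
{
    return free_pages < t->limit;
}
```

With 32 free pages, a task whose `limit` is 16 proceeds under the
per-task policy while one whose `limit` is 64 waits; under the
flag-based policy with a global limit of 64, only flagged tasks proceed.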


Thread overview: 22+ messages
2004-07-26 13:11 [PATCH] Deadlock during heavy write activity to userspace NFS server on local NFS mount Avi Kivity
2004-07-26 21:02 ` Pavel Machek
2004-07-27 20:22   ` Avi Kivity
2004-07-27 20:34     ` Pavel Machek
2004-07-27 21:02       ` Avi Kivity
2004-07-28  1:29         ` Nick Piggin
2004-07-28  2:17           ` Trond Myklebust
2004-07-28  5:13             ` Avi Kivity
2004-07-28  5:11           ` Avi Kivity
2004-07-28  5:29             ` Nick Piggin
2004-07-28  7:05               ` Avi Kivity
2004-07-28  7:16                 ` Nick Piggin
2004-07-28  7:45                   ` Avi Kivity
2004-07-28  9:05                     ` Nick Piggin [this message]
2004-07-28 10:11                       ` Avi Kivity
2004-07-28 10:30                         ` Nick Piggin
2004-07-28 11:48                           ` Avi Kivity
2004-07-29  8:29                             ` Nick Piggin
2004-07-29 12:19                               ` Marcelo Tosatti
2004-07-29 16:09                               ` Avi Kivity
2004-07-28 12:08       ` Mikulas Patocka
2004-07-28 12:18         ` Avi Kivity
