linux-fsdevel.vger.kernel.org archive mirror
From: Avi Kivity <avi@argo.co.il>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: alan@lxorguk.ukuu.org.uk, torvalds@osdl.org, hbryan@us.ibm.com,
	akpm@osdl.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, pavel@ucw.cz
Subject: Re: [PATCH] [Request for inclusion] Filesystem in Userspace
Date: Tue, 30 Nov 2004 23:37:26 +0200
Message-ID: <41ACE816.50104@argo.co.il>
In-Reply-To: <E1CZFJP-0004uZ-00@dorka.pomaz.szeredi.hu>

Miklos Szeredi wrote:

>>you're describing the deadlock here: all memory is full, no process 
>>which allocates memory can make any progress.
>
>Yes they can: the allocation will fail, function will return -ENOMEM,
>malloc will return NULL, pagefault will fail with OOM.  This is
>progress, though not the best sort.  It is most certainly _not_ a
>deadlock.
>
Looks like we are in a deadlock here :)

However you choose to call it, it is unacceptable IMO.

>>This is not a true oom situation: there can be plenty of memory in
>>dirty pagecache which we could reclaim if we had that tiny bit of
>>reserve memory.
>
>The amount of reserved memory that would be needed depends upon the
>filesystem.  Some filesystems would need only very little to be able
>to free some memory, some would need a lot (e.g. a bzip2 compressing
>filesystem).  There's no magic solution with reserving memory.
>
So the userspace filesystem would pass that amount to the kernel. It's 
not pretty, but it is workable.
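
As a rough illustration of the reserve idea (a toy model, not kernel or FUSE code; the class, its methods, and the page counts are all hypothetical), the filesystem declares how many pages it needs to make progress, and only allocations made on behalf of writeback may dip into that amount:

```python
class MemoryPool:
    """Toy page accounting with a per-filesystem reserve (hypothetical model)."""

    def __init__(self, total_pages, fs_reserve_pages):
        # Pages any caller may take.
        self.free_pages = total_pages - fs_reserve_pages
        # Pages held back; usable only while servicing writeback.
        self.reserve_pages = fs_reserve_pages

    def alloc(self, pages, for_writeback=False):
        """Return True on success, False if the caller would have to block."""
        if pages <= self.free_pages:
            self.free_pages -= pages
            return True
        if for_writeback and pages <= self.free_pages + self.reserve_pages:
            # Use up what is left of the general pool, then the reserve.
            needed_from_reserve = pages - self.free_pages
            self.free_pages = 0
            self.reserve_pages -= needed_from_reserve
            return True
        return False


pool = MemoryPool(total_pages=100, fs_reserve_pages=4)
pool.alloc(96)                      # ordinary processes fill memory
pool.alloc(1)                       # False: an ordinary caller would block
pool.alloc(2, for_writeback=True)   # True: the filesystem can still progress
```

How large the reserve has to be is exactly the per-filesystem figure discussed above; the sketch only shows the accounting.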

>And this is not unique to userspace filesystems, as Rik van Riel
>pointed out earlier, network filesystems are also prone to deadlock:
>
>http://lkml.org/lkml/2004/11/27/81
>
This looks like a bug to me. Maybe jiggling the thresholds would help.

>>>I looked at ramfs, it isn't even limited.  You can easily crash your
>>>system just by filling it up with data, but no deadlock will happen.
>>Right. But ramfs doesn't call a userspace process which calls the kernel 
>>right back.
>
>Doesn't matter one little whit.  The end result is the same: Out Of
>Memory, which is _not_ equivalent to deadlock.  Please think it over.
>
The situation with userspace filesystems is:

  some process allocates memory, blocking on kswapd as memory is full
  kswapd calls userspace filesystem to free memory
  userspace filesystem calls kernel, which allocates memory and blocks on kswapd
  eventually all processes in the system block on kswapd

I have observed (and fixed) this on a real system.
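
The sequence above can be sketched as a wait-for graph, where a cycle means no task in it can ever run again. This is only a toy model: the task names mirror the steps listed, `find_cycle` is a hypothetical helper, and each task is assumed to wait on at most one other task.

```python
def find_cycle(waits_for):
    """Return the tasks forming a wait cycle, or None if there is no cycle."""
    for start in waits_for:
        path = [start]
        seen = {start}
        node = waits_for.get(start)
        while node is not None:
            if node in seen:
                # Everything from the first visit of `node` onward is the cycle.
                return path[path.index(node):]
            path.append(node)
            seen.add(node)
            node = waits_for.get(node)
    return None


without_reserve = {
    "process": "kswapd",        # allocation blocks until memory is freed
    "kswapd": "userspace_fs",   # asks the filesystem to write out dirty pages
    "userspace_fs": "kswapd",   # the filesystem's own allocation blocks too
}

with_reserve = {
    "process": "kswapd",
    "kswapd": "userspace_fs",
    # userspace_fs allocates from a reserve, so it waits on nothing
}

print(find_cycle(without_reserve))  # ['kswapd', 'userspace_fs']
print(find_cycle(with_reserve))     # None
```

In the first graph nothing fails and nothing returns -ENOMEM; every task simply waits, which is why the situation is a deadlock rather than an ordinary out-of-memory condition.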

With ramfs, once it accounts for memory, there would be neither a deadlock
nor an OOM.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


