From: Chris Mason <mason@suse.com>
To: Rik van Riel <riel@conectiva.com.br>
Cc: Marcelo Tosatti <marcelo@conectiva.com.br>,
	Xuan Baldauf <xuan--lkml@baldauf.org>,
	linux-kernel@vger.kernel.org, andrea@suse.de,
	"reiserfs-list@namesys.com" <reiserfs-list@namesys.com>
Subject: Re: VM deadlock
Date: Wed, 27 Jun 2001 16:24:10 -0400
Message-ID: <933130000.993673450@tiny>
In-Reply-To: <Pine.LNX.4.33L.0106271641570.23373-100000@duckman.distro.conectiva>



On Wednesday, June 27, 2001 04:43:28 PM -0300 Rik van Riel
<riel@conectiva.com.br> wrote:

> On Wed, 27 Jun 2001, Chris Mason wrote:
> 
>> Reiserfs expects write_inode() calls initiated by kswapd to
>> always have sync==0.  Otherwise, kswapd ends up waiting on the
>> log, which isn't what we want at all.
> 
> If you don't have free memory, you are limited to 2 choices:
> 
> 1) wait on IO
> 2) spin endlessly, wasting CPU until the IO is done
> 
> If (1) isn't possible in reiserfs, I'd say something in
> reiserfs needs to be fixed, otherwise you will always
> have problems when the system has lots of dirty mappings
> that need to be written out.
> 

Ok, I need to describe the problem a little better.  reiserfs inodes need
to be logged, which means you have to join/start a transaction in order to
write them.

So, if kswapd tries to write them, it might end up waiting on the log.
Normally this is not a big deal, but almost all allocations in reiserfs use
GFP_BUFFER, which means we never end up doing i/o ourselves in
page_launder, and always end up waiting on kswapd.  So, kswapd waits on
reiserfs and reiserfs waits on kswapd (none of these are spin locks ;-)
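
The wait cycle described above can be illustrated with a small userspace
model.  This is only a sketch under stated assumptions, not kernel code:
the SKETCH_* flags, the sketch_actor enum, and every function name are
hypothetical and invented for illustration, and the flag values are
arbitrary.  The one property taken from the description is that a
GFP_BUFFER-style allocation cannot start i/o itself, so it waits on
kswapd.

```c
#include <stdbool.h>

/* Hypothetical actors in the model; these are not kernel structures. */
enum sketch_actor {
    SKETCH_NONE = -1,     /* waits on nobody, so it makes progress */
    SKETCH_KSWAPD = 0,
    SKETCH_FS_TASK = 1,   /* a reiserfs task that is using the log */
    SKETCH_ACTOR_COUNT = 2
};

/* Illustrative allocation flags; the values are arbitrary. */
#define SKETCH_GFP_WAIT   0x1u  /* caller may sleep */
#define SKETCH_GFP_IO     0x2u  /* caller may start i/o itself */
#define SKETCH_GFP_BUFFER (SKETCH_GFP_WAIT)
#define SKETCH_GFP_KERNEL (SKETCH_GFP_WAIT | SKETCH_GFP_IO)

/* A task that may not start i/o itself has to wait for kswapd. */
enum sketch_actor sketch_fs_task_waits_on(unsigned int gfp_mask)
{
    return (gfp_mask & SKETCH_GFP_IO) ? SKETCH_NONE : SKETCH_KSWAPD;
}

/* If writing an inode joins a transaction, kswapd waits on the log,
 * which in this model means waiting on the filesystem task. */
enum sketch_actor sketch_kswapd_waits_on(bool write_inode_joins_log)
{
    return write_inode_joins_log ? SKETCH_FS_TASK : SKETCH_NONE;
}

/* Follow the waits-on chain starting at kswapd.  With two actors, a
 * chain that is still going after two hops has returned to its start. */
bool sketch_wait_cycle(unsigned int gfp_mask, bool write_inode_joins_log)
{
    enum sketch_actor waits_on[SKETCH_ACTOR_COUNT];
    enum sketch_actor cur = SKETCH_KSWAPD;
    int hops;

    waits_on[SKETCH_KSWAPD] = sketch_kswapd_waits_on(write_inode_joins_log);
    waits_on[SKETCH_FS_TASK] = sketch_fs_task_waits_on(gfp_mask);

    for (hops = 0; hops < SKETCH_ACTOR_COUNT; hops++) {
        cur = waits_on[cur];
        if (cur == SKETCH_NONE)
            return false;   /* somebody can make progress */
    }
    return true;            /* kswapd -> fs task -> kswapd */
}
```

In this model the cycle only closes when both conditions hold: the
allocation cannot do i/o on its own, and write_inode has to join the
log.  Removing either edge breaks it.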

The workaround I've been using is the dirty_inode method.  Whenever
mark_inode_dirty is called, reiserfs logs the dirty inode.  This means
inode changes are _always_ reflected in the buffer cache right away, and
the inode itself is never actually dirty.

So, the only time reiserfs_write_inode needs to do something is for fsync
and/or O_SYNC writes, and all it needs to do is commit the transaction.  

Any time kswapd is calling write_inode, it is just trying to free the inode
struct, and reiserfs can safely ignore the write request, regardless of
whether a sync is requested.
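
The policy above can be sketched as a small userspace model.  Again this
is an illustration under assumptions, not the reiserfs code: the
sketch_log and sketch_inode structs, the function names, and the
caller_is_reclaim parameter (standing in for "this call came from
kswapd") are all hypothetical.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the journal and an in-memory inode. */
struct sketch_log {
    int logged_updates;   /* inode changes copied into the log */
    int commits;          /* transactions forced to disk */
};

struct sketch_inode {
    bool dirty;           /* true if changes exist only in the inode struct */
};

/* Model of the dirty_inode method: log the change right away, so the
 * inode itself never stays dirty. */
void sketch_dirty_inode(struct sketch_log *log, struct sketch_inode *inode)
{
    log->logged_updates++;
    inode->dirty = false;
}

/* Model of write_inode under this scheme.  Returns true if a commit was
 * done.  Calls from memory reclaim are ignored whatever do_sync says,
 * because the change is already in the log; only fsync or O_SYNC style
 * callers get a commit. */
bool sketch_write_inode(struct sketch_log *log, bool do_sync,
                        bool caller_is_reclaim)
{
    if (caller_is_reclaim)
        return false;     /* never wait on the log from reclaim */
    if (!do_sync)
        return false;     /* nothing to write, change is already logged */
    log->commits++;
    return true;
}
```

With this shape, a write request that arrives from reclaim with sync set
returns immediately instead of joining a transaction, which is the
property needed to keep kswapd from waiting on the log.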

-chris

