From: Chris Mason <mason@suse.com>
To: Andrew Morton <akpm@zip.com.au>
Cc: Johan Ekenberg <johan@ekenberg.se>,
	Alan Cox <alan@lxorguk.ukuu.org.uk>,
	jack@suse.cz, linux-kernel@vger.kernel.org
Subject: Re: Lockups with 2.4.14 and 2.4.16
Date: Fri, 14 Dec 2001 12:53:00 -0500
Message-ID: <3845670000.1008352380@tiny>
In-Reply-To: <3C1A3652.52B989E4@zip.com.au>
In-Reply-To: <000a01c1829f$75daf7a0$050010ac@FUTURE> <3825380000.1008348567@tiny> <3C1A3652.52B989E4@zip.com.au>



On Friday, December 14, 2001 09:26:42 AM -0800 Andrew Morton
<akpm@zip.com.au> wrote:

> Chris Mason wrote:
>> 
>> Ok, Johan sent along stack traces, and the deadlock works a little like
>> this:
>> 
>> linux-2.4.16 + reiserfs + quota v2
>> 
>> kswapd ->
>> prune_icache->dispose_list->dquot_drop->commit_dquot->generic_file_write->
>> mark_inode_dirty->journal_begin-> wait for trans to end
> 
> uh-huh.
> 
>> Some process in the transaction is waiting on kswapd to free ram.
> 
> This is unfamiliar.  Where does a process block on kswapd in this
> manner?  Not __alloc_pages I think.

It kinda blocks on kswapd by default when the process in the transaction
needs to read a block, or allocate one for the commit.  Since kswapd is stuck
waiting on the log, eventually a process holding the transaction will try to
allocate something when there are no pages freeable with GFP_NOFS.

It was much worse when we just had GFP_BUFFER before, but the deadlock is
still there.
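
The wait cycle described above can be modeled as a tiny wait-for graph. This is only an illustrative sketch: the actor names and both functions are invented here and are not kernel symbols. It assumes the three-way dependency from the trace: kswapd waits for the running transaction to end, the transaction cannot end until its holder finishes, and the holder needs a page that only kswapd can free.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical actors for illustration; these are not kernel symbols. */
enum actor { KSWAPD, JOURNAL_TRANSACTION, TRANSACTION_HOLDER, ACTOR_COUNT };

/*
 * waits_on[a] is the actor that `a` is blocked on, or -1 if `a` can run.
 * Follow the chain from `start`; reaching `start` again means a cycle.
 */
static bool has_wait_cycle(const int waits_on[ACTOR_COUNT], enum actor start)
{
    int current = (int)start;

    for (int steps = 0; steps < ACTOR_COUNT; steps++) {
        current = waits_on[current];
        if (current < 0)
            return false;            /* chain ends at a runnable actor */
        assert(current < ACTOR_COUNT);
        if (current == (int)start)
            return true;             /* came back around: deadlock */
    }
    return false;
}

/* Build the graph from the trace and report whether kswapd is in a cycle. */
static bool quota_writeback_deadlocks(bool kswapd_commits_dquot)
{
    int waits_on[ACTOR_COUNT];

    /* prune_icache -> ... -> journal_begin: wait for the transaction to end */
    waits_on[KSWAPD] = kswapd_commits_dquot ? JOURNAL_TRANSACTION : -1;
    /* the transaction cannot end until the process holding it finishes */
    waits_on[JOURNAL_TRANSACTION] = TRANSACTION_HOLDER;
    /* the holder needs memory that only kswapd can free */
    waits_on[TRANSACTION_HOLDER] = KSWAPD;

    return has_wait_cycle(waits_on, KSWAPD);
}
```

In this model, removing the single edge from kswapd into the journal is enough to break the cycle, which is the property any of the fixes discussed in this thread needs to provide.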

>  
>> So, this will hit any journaled FS that uses quotas and logs inodes
>> during a write.  ext3 doesn't seem to do special things for quota anymore,
>> so it should be affected too.
> 
> mm.. most of the ext3 damage-avoidance hacks are around writepage().

sct talked about how the ext3 data logging code allowed quotas to be
consistent after a crash.  Perhaps this was just in 2.2.x...

> 
>> The only fix I see is to make sure kswapd doesn't run shrink_icache, and to
>> have it done via a dedicated daemon instead.  Does anyone have a better
>> idea?
> 
> Well, we already need to do something like that to prevent the
> abuse of keventd in there.  It appears that somebody had a
> problem with deadlocks doing the inode writeout in kswapd but
> missed the quota problem.
> 
> Is it possible for the quota code to just bale out if PF_MEMALLOC
> is set?  To leave the dquot dirty?

We could change prune_icache to skip inodes with dirty quota fields.  It
already skips dirty inodes, so this isn't a huge change.
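
A minimal sketch of that skip test, assuming simplified stand-in structures. The types, field names, and functions below are invented for illustration and are not the 2.4 kernel's definitions; the real prune_icache has more conditions than shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAXQUOTAS 2   /* assumption: one user and one group quota slot */

/* Hypothetical stand-ins for the kernel's dquot and inode structures. */
struct sketch_dquot {
    bool dirty;              /* quota usage changed but not yet written */
};

struct sketch_inode {
    bool dirty;              /* inode itself needs writeback */
    int use_count;           /* nonzero while something references it */
    struct sketch_dquot *dquot[SKETCH_MAXQUOTAS];
};

/* True if dropping this inode would force a quota write. */
static bool has_dirty_dquot(const struct sketch_inode *inode)
{
    for (size_t i = 0; i < SKETCH_MAXQUOTAS; i++) {
        if (inode->dquot[i] != NULL && inode->dquot[i]->dirty)
            return true;
    }
    return false;
}

/*
 * An inode is prunable only if freeing it cannot trigger a write:
 * it is unused, clean, and carries no dirty quota.
 */
static bool can_prune(const struct sketch_inode *inode)
{
    assert(inode != NULL);
    return inode->use_count == 0 && !inode->dirty && !has_dirty_dquot(inode);
}

/* Count how many inodes in the array a pruning pass could free. */
static size_t count_prunable(const struct sketch_inode *inodes, size_t n)
{
    size_t prunable = 0;

    for (size_t i = 0; i < n; i++) {
        if (can_prune(&inodes[i]))
            prunable++;
    }
    return prunable;
}
```

The point of the extra check is that the pruning path never reaches a quota commit, so it never has to wait on the journal from kswapd context.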

I'll try this, and also add kinoded so we can avoid using keventd.  I'm wary
of the effects on the VM if kinoded can't keep up though, so I'd like to keep
the shrink_icache call in kswapd if possible.

Johan, expect this to take at least a week before I suggest installing on
production machines.  Things are very intertwined here, and these changes
will probably have side effects that need dealing with.

Turning quotas off will solve the problem in the short term.

-chris

