From: Andrew Morton <akpm@linux-foundation.org>
To: Artem.Bityutskiy@nokia.com
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: forcing write-back from FS - again
Date: Sun, 21 Oct 2007 13:55:26 -0700
Message-ID: <20071021135526.57db7519.akpm@linux-foundation.org>
In-Reply-To: <471BB45D.8070509@nokia.com>

On Sun, 21 Oct 2007 23:19:41 +0300 Artem Bityutskiy <Artem.Bityutskiy@nokia.com> wrote:

> Hi Andrew,
> 
> some time ago we were talking about doing write-back from inside a file-system 
> (http://marc.info/?l=linux-kernel&m=119097117713616&w=2). You said that I'm not 
> the only person who needs this, because the same thing is needed for delayed 
> allocation.
> 
> The problem is that if we initiate write-back from prepare_write() while we 
> hold the lock on a dirty page, we deadlock in write_cache_pages(), which tries 
> to lock the same page.
> 
> You suggested enhancing struct writeback_control to carry the page that should 
> be skipped.
> 
> ...
>
> but it does not actually work, because if we have two processes forcing 
> write-back from write_page(), they will mutually deadlock (A waits in 
> write_cache_pages() on a page B has locked, B waits on inode or page A has locked).

Yeah, I was just thinking that as I read this ;)
 
> So this way is not ok, do you have any other ideas?
> 
> We could mark the page clean temporarily before doing write-back, and mark it 
> dirty again, but this seems to be inefficient (although I'm not sure, I need to 
> dig deeper into these functions, but they _seem_ to traverse the radix tree and 
> change tags, so marking one page dirty may need to change many tags, but again, 
> I did not really dig into this yet).
> 
> I'd appreciate any suggestions. Thanks!

We could just skip locked pages altogether in writeback.  Perhaps in
WB_SYNC_NONE mode, or perhaps add a new flag in writeback_control to select
this behaviour.

It _should_ be the case that the number of locked-and-dirty pages which
writeback encounters is very small, so skipping locked pages during
writeback-for-memory-flushing won't have any significant effect.  The first
step should be to add a new /proc/vmstat field to count these pages and
then do broad testing (especially on blocksize<pagesize filesystems) to
confirm the theory.

We'll still need to synchronously lock the page in
writeback-for-data-integrity mode though.
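
To make the idea concrete, here is a minimal userspace sketch of the
skip-locked-pages behaviour. It is a toy model under stated assumptions, not
kernel code and not a patch: struct model_page, struct wb_stats,
model_trylock(), model_unlock() and model_writeback() are all hypothetical
names, the skipped_locked counter merely stands in for the proposed
/proc/vmstat field, and the -1 return marks the point where real
data-integrity writeback would sleep on the page lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; the names only loosely mirror kernel terminology. */
enum model_sync_mode { MODEL_SYNC_NONE, MODEL_SYNC_ALL };

struct model_page {
	bool locked;	/* someone else (e.g. prepare_write) holds the page lock */
	bool dirty;
};

struct wb_stats {
	unsigned long written;
	unsigned long skipped_locked;	/* stand-in for the proposed vmstat counter */
};

/* Non-blocking lock attempt: returns true if we now own the page lock. */
static bool model_trylock(struct model_page *p)
{
	if (p->locked)
		return false;
	p->locked = true;
	return true;
}

static void model_unlock(struct model_page *p)
{
	p->locked = false;
}

/*
 * Walk the pages and "write" the dirty ones.
 * MODEL_SYNC_NONE: a locked dirty page is counted and skipped.
 * MODEL_SYNC_ALL:  a locked dirty page makes us return -1, which marks
 *                  the spot where real code would block on the page lock.
 */
static int model_writeback(struct model_page *pages, size_t n,
			   enum model_sync_mode mode, struct wb_stats *st)
{
	assert(pages != NULL && st != NULL);

	for (size_t i = 0; i < n; i++) {
		struct model_page *p = &pages[i];

		if (!p->dirty)
			continue;
		if (!model_trylock(p)) {
			if (mode == MODEL_SYNC_NONE) {
				st->skipped_locked++;
				continue;
			}
			return -1;	/* integrity mode: would have to wait */
		}
		p->dirty = false;	/* pretend the page went to disk */
		st->written++;
		model_unlock(p);
	}
	return 0;
}
```

In this model a caller that already holds one page's lock can run a
MODEL_SYNC_NONE pass without waiting on its own page: that page stays dirty
and shows up in skipped_locked. Whether the real count is small enough to
ignore is exactly what the proposed testing would need to confirm.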

