public inbox for linux-kernel@vger.kernel.org
From: Andrew Morton <akpm@zip.com.au>
To: Alexander Viro <viro@math.psu.edu>
Cc: lkml <linux-kernel@vger.kernel.org>
Subject: Re: [prepatch] address_space-based writeback
Date: Wed, 10 Apr 2002 12:16:26 -0700	[thread overview]
Message-ID: <3CB48F8A.DF534834@zip.com.au> (raw)
In-Reply-To: <3CB4203D.C3BE7298@zip.com.au> <Pine.GSO.4.21.0204100725410.15110-100000@weyl.math.psu.edu>

Alexander Viro wrote:
> 
> On Wed, 10 Apr 2002, Andrew Morton wrote:
> 
> >
> > This is a largish patch which makes some fairly deep changes.  It's
> > currently at the "wow, it worked" stage.  Most of it is fairly
> > mature code, but some conceptual changes were recently made.
> > Hopefully it'll be in respectable state in a few days, but I'd
> > like people to take a look.
> >
> > The idea is: all writeback is against address_spaces.  All dirty data
> > has the dirty bit set against its page.  So all dirty data is
> > accessible by
> >
> >       superblock list
> >               -> superblock dirty inode list
> >                       -> inode mapping's dirty page list.
> >                               -> page_buffers(page) (maybe)
> 
> Wait.

Hi, Al.  Nothing has really changed wrt the things to which you
refer; i.e., they would already be a problem.  The relationships
between dirty pages, address_spaces, inodes and superblocks
are unchanged, except for one thing:  __mark_inode_dirty will
now attach blockdev inodes to the dummy blockdev superblock's
dirty inodes list.

The main thing which is being changed is buffers. The assumption is
that buffers can be hunted down via
superblocklist->superblock->dirty_inode->i_mapping->writeback_mapping,
not via the global LRU.

>  You are assuming that all address_spaces happen to be ->i_mapping of
> some inode.

Sorry, the above diagram is not really accurate.  The sync/kupdate/bdflush
writeback path is really:

	superblock list
		-> superblock dirty inode list
			->inode->i_mapping->a_ops->writeback_mapping(mapping)

So core kernel does not actually assume that the to-be-written
pages are accessible via inode->i_mapping->dirty_pages.
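
A user-space sketch of that walk, for illustration only.  The structs
are toy stand-ins rather than the kernel's definitions, and
sync_all_sbs(), toy_writeback_mapping(), next_dirty and nr_dirty are
names invented here.  The point it tries to show is that the walker
only calls the a_op and never looks at a dirty page list itself.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; field names follow the mail, layouts are invented. */
struct address_space;
struct inode;

struct address_space_operations {
	/* In this toy model the method returns the number of pages written. */
	int (*writeback_mapping)(struct address_space *mapping, int nr_pages);
};

struct address_space {
	const struct address_space_operations *a_ops;
	struct inode *host;
	int nr_dirty;			/* stands in for the dirty page list */
};

struct inode {
	struct address_space *i_mapping;
	struct inode *next_dirty;	/* link on the superblock's dirty list */
};

struct super_block {
	struct inode *s_dirty;		/* head of the dirty inode list */
	struct super_block *next;	/* link on the superblock list */
};

/* Hypothetical default method: "write" up to nr_pages dirty pages. */
static int toy_writeback_mapping(struct address_space *mapping, int nr_pages)
{
	int n = mapping->nr_dirty < nr_pages ? mapping->nr_dirty : nr_pages;

	mapping->nr_dirty -= n;
	return n;
}

/* superblock list -> dirty inode list -> i_mapping->a_ops->writeback_mapping */
static int sync_all_sbs(struct super_block *sb_list, int nr_pages)
{
	int written = 0;

	for (struct super_block *sb = sb_list; sb; sb = sb->next) {
		for (struct inode *inode = sb->s_dirty; inode;
		     inode = inode->next_dirty) {
			struct address_space *mapping = inode->i_mapping;

			assert(mapping && mapping->a_ops);
			/* The method decides where the dirty pages live. */
			written += mapping->a_ops->writeback_mapping(mapping,
								      nr_pages);
		}
	}
	return written;
}
```

Because the only contract is the method call, a filesystem is free to
keep its dirty pages somewhere other than inode->i_mapping->dirty_pages.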

I believe that the object relationship you're describing is
that inode->i_mapping points to the main address_space, and
the `host' fields of the main and private address_spaces
both point at the same inode?  That the inode owns two
address_spaces?

That's OK.  When a page is dirtied, the kernel will attach
that page to the private address_space's dirty pages list and
will attach the common inode to its superblock's dirty inodes list.

For writeback, core kernel will perform

	inode->i_mapping->a_ops->writeback_mapping(mapping, nr_pages)

which will hit the writeback_mapping() method of the inode's main
address_space.  That method will do:

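my_writeback_mapping(mapping, nr_pages)
{
	/* write the main address_space's dirty pages */
	generic_writeback_mapping(mapping, nr_pages);
	/* then the private address_space hanging off the same inode */
	mapping->host->private_mapping->a_ops->writeback_mapping(
		mapping->host->private_mapping, nr_pages);
}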
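
The same chaining as a compilable user-space sketch.  It is only a
model: the structs are toy stand-ins, private_mapping is a field the
filesystem would have to provide itself, nr_dirty replaces the real
dirty page list, and the int return value (pages written) is an
assumption made so the result can be checked.

```c
#include <assert.h>
#include <stddef.h>

struct address_space;
struct inode;

struct address_space_operations {
	int (*writeback_mapping)(struct address_space *mapping, int nr_pages);
};

struct address_space {
	const struct address_space_operations *a_ops;
	struct inode *host;		/* both mappings point at one inode */
	int nr_dirty;			/* stands in for the dirty page list */
};

struct inode {
	struct address_space *i_mapping;	/* main address_space */
	struct address_space *private_mapping;	/* hypothetical, fs-private */
};

/* Toy generic method: "write" up to nr_pages pages of this mapping. */
static int generic_writeback_mapping(struct address_space *mapping, int nr_pages)
{
	int n = mapping->nr_dirty < nr_pages ? mapping->nr_dirty : nr_pages;

	mapping->nr_dirty -= n;
	return n;
}

/* Method of the main mapping: write itself, then chain to the private one. */
static int my_writeback_mapping(struct address_space *mapping, int nr_pages)
{
	struct address_space *priv = mapping->host->private_mapping;
	int written = generic_writeback_mapping(mapping, nr_pages);

	/* The private mapping must be a different object from the main one. */
	assert(priv != mapping);
	if (priv && priv->a_ops && priv->a_ops->writeback_mapping)
		written += priv->a_ops->writeback_mapping(priv, nr_pages);
	return written;
}
```

Core kernel only ever sees inode->i_mapping; the private address_space
gets written because the main one forwards the call.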

> ...
> What's more, I wonder how well does your scheme work with ->i_mapping
> to a different inode's ->i_data (CODA et.al., file access to block devices).

Presumably, those different inodes have a superblock?  In that
case set_page_dirty will mark that inode dirty wrt its own
superblock.  set_page_dirty() is currently an optional a_op,
but it's not obvious that there will be a need for that.

The one thing which does worry me a bit is why __mark_inode_dirty
tests for a null ->i_sb.  If the inode doesn't have a superblock
then its pages are hidden from the writeback functions.

This is not fatal per se.  The pages are still visible to the VM
via the LRU, and presumably the filesystem knows how to sync
its own stuff.  But for memory-balancing and throttling purposes,
I'd prefer that the null ->i_sb not be allowed.  Where can this
occur?
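
To make the concern concrete, here is a toy model of that check.  It is
not the kernel's __mark_inode_dirty; toy_mark_inode_dirty(),
count_dirty_inodes(), next_dirty and on_dirty_list are names made up for
this sketch.  It only shows that an inode with a null ->i_sb never
reaches any list that a superblock-driven walker can see.

```c
#include <assert.h>
#include <stddef.h>

struct inode;

struct super_block {
	struct inode *s_dirty;		/* head of the dirty inode list */
};

struct inode {
	struct super_block *i_sb;	/* may be NULL, which is the worry */
	struct inode *next_dirty;
	int on_dirty_list;
};

/* Toy version of the test: an inode with no superblock is skipped. */
static void toy_mark_inode_dirty(struct inode *inode)
{
	assert(inode != NULL);
	if (!inode->i_sb)
		return;			/* nothing to attach it to */
	if (inode->on_dirty_list)
		return;			/* already queued */
	inode->next_dirty = inode->i_sb->s_dirty;
	inode->i_sb->s_dirty = inode;
	inode->on_dirty_list = 1;
}

/* What a walker that starts from the superblock is able to find. */
static int count_dirty_inodes(const struct super_block *sb)
{
	int n = 0;

	for (const struct inode *i = sb->s_dirty; i; i = i->next_dirty)
		n++;
	return n;
}
```

In this model the orphan inode's pages would have to be found some other
way, which is the memory-balancing and throttling gap described above.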

> BTW, CODA-like beasts can have several local filesystems for cache - i.e.
> ->i_mapping for dirty inodes from the same superblock can ultimately go
> to different queues.

When a page is dirtied, those inodes will be attached to their
->i_sb's dirty inode list; I haven't changed any of that.

>  Again, the same goes for stuff like
> dd if=foo of=/dev/floppy - you get dirty inode of /dev/floppy with ->i_mapping
> pointing to bdev filesystem and queue we end up with having nothing in common
> with that of root fs' device.

OK.  What has changed here is that the resulting mark_buffer_dirty
calls will now set the page dirty, and attach the page to its
mapping's dirty_pages, and attach its mapping's host to
mapping->host->i_sb->s_dirty.

So writeback for the floppy device is no longer via the
buffer LRU.  It's via

	dummy blockdev superblock
		-> blockdev's inode
			->i_mapping
				->writeback_mapping
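
A rough user-space model of that dirtying chain, under the assumption
that counters can stand in for the real lists.  All toy_* functions and
the nr_dirty_* fields are invented for the sketch; only the order of the
steps (buffer, then page, then mapping, then the host's superblock) is
taken from the description above.

```c
#include <assert.h>
#include <stddef.h>

struct super_block { int nr_dirty_inodes; };	/* stands in for s_dirty */
struct inode { struct super_block *i_sb; int dirty; };
struct address_space { struct inode *host; int nr_dirty_pages; };
struct page { struct address_space *mapping; int dirty; };
struct buffer_head { struct page *b_page; int dirty; };

/* Attach the inode to its own superblock's dirty list, once. */
static void toy_mark_inode_dirty(struct inode *inode)
{
	if (!inode->i_sb || inode->dirty)
		return;
	inode->dirty = 1;
	inode->i_sb->nr_dirty_inodes++;
}

/* Attach the page to its mapping's dirty pages, then dirty the host. */
static void toy_set_page_dirty(struct page *page)
{
	if (page->dirty)
		return;
	page->dirty = 1;
	page->mapping->nr_dirty_pages++;
	toy_mark_inode_dirty(page->mapping->host);
}

/* Dirtying a buffer now also dirties the page that carries it. */
static void toy_mark_buffer_dirty(struct buffer_head *bh)
{
	assert(bh->b_page && bh->b_page->mapping);
	bh->dirty = 1;
	toy_set_page_dirty(bh->b_page);
}
```

With the page's mapping hosted by the blockdev inode, the dirty state
lands on the dummy blockdev superblock and not on the superblock of the
filesystem that holds the /dev node.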

> I'd really like to see where are you going with all that stuff - if you
> expect some correspondence between superblocks and devices, you've are
> walking straight into major PITA.

Hopefully, none of that has changed.  It's just the null inode->i_sb
which is a potential problem.

