linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: David Howells <dhowells@redhat.com>
To: Daniel Phillips <phillips@phunq.net>
Cc: Trond.Myklebust@netapp.com, nfsv4@linux-nfs.org,
	linux-kernel@vger.kernel.org, dhowells@redhat.com,
	linux-fsdevel@vger.kernel.org,
	linux-security-module@vger.kernel.org, selinux@tycho.nsa.gov,
	casey@schaufler-ca.com
Subject: Re: [PATCH 00/37] Permit filesystem local caching
Date: Sat, 23 Feb 2008 01:22:38 +0000	[thread overview]
Message-ID: <25049.1203729758@redhat.com> (raw)
In-Reply-To: <200802221425.48535.phillips@phunq.net>

Daniel Phillips <phillips@phunq.net> wrote:

> I am eventually going to suggest cutting the backing filesystem entirely out
> of the picture,

You still need a database to manage the cache.  A filesystem such as Ext3
makes a very handy database for four reasons:

 (1) It exists and works.

 (2) It has a well defined interface within the kernel.

 (3) I can place my cache on, say, my root partition on my laptop.  I don't
     have to dedicate a partition to the cache.

 (4) Userspace cache management tools (such as cachefilesd) have an already
     existing interface to use: rmdir, unlink, open, getdents, etc..

I do have a cache-on-blockdev thing, but it's basically a wandering tree
filesystem inside.  It is, or was, much faster than ext3 on a clean cache, but
it degrades horribly over time because my free space reclamation sucks - it
gradually randomises the block allocation sequence over time.

So, what would you suggest instead of a backing filesystem?

> I really do not like idea of force fitting this cache into a generic
> vfs model.  Sun was collectively smoking some serious crack when they
> cooked that one up.  But there is also the ageless principle "isness is
> more important than niceness".

What do you mean?  I'm not doing it like Sun.  The cache is a side path from
the netfs.  It should be transparent to the user, the VFS and the server.

The only place it might not be transparent is that you might to have to
instruct the netfs mount to use the cache.  I'd prefer to do it some other way
than passing parameters to mount, though, as (1) this causes fun with NIS
distributed automounter maps, and (2) people are asking for a finer grain of
control than per-mountpoint.  Unfortunately, I can't seem to find a way to do
it that's acceptable to Al.

> Which would require a change to NFS, not an option because you hope to
> work with standard servers?  Of course with years to think about this,
> the required protocol changes were put into v4.  Not.

I don't think there's much I can do about NFS.  It requires the filesystem
from which the NFS server is dealing to have inode uniquifiers, which are then
incorporated into the file handle.  I don't think the NFS protocol itself
needs to change to support this.

> Have you completely exhausted optimization ideas for the file handle
> lookup?

No, but there aren't many.  CacheFiles doesn't actually do very much, and it's
hard to reduce that not very much.  The most obvious thing is to prepopulate
the dcache, but that's at the expense of memory usage.

Actually, if I cache the name => FH mapping I used last time, I can make a
start on looking up in the cache whilst simultaneously accessing the server.
If what's on the server has changed, I can ditch the speculative cache lookup
I was making and start a new cache lookup.

However, storing directory entries has penalties of its own, though it'll be
necesary if we want to do disconnected operation.

> > Where "lookup table" == "dcache".  That would be good yes.  cachefilesd
> > prescans all the files in the cache, which ought to do just that, but it
> > doesn't seem to be very effective.  I'm not sure why.
> 
> RCU?  Anyway, it is something to be tracked down and put right.

cachefilesd runs in userspace.  It's possible it isn't doing enough to preload
all the metadata.

> What I tried to say.  So still... got any ideas?  That extra synchronous
> network round trip is a killer.  Can it be made streaming/async to keep
> throughput healthy?

That's a per-netfs thing.  With the test rig I've got, it's going to the
on-disk cache that's the killer.  Going over the network is much faster.

See the results I posted.  For the tarball load, and using Ext3 to back the
cache:

	Cold NFS cache, no disk cache:		0m22.734s
	Warm on-disk cache, cold pagecaches:	1m54.350s

The problem is reading using tar is a worst case workload for this.  Everything
it does is pretty much completely synchronous.

One thing that might help is if things like tar and find can be made to use
fadvise() on directories to hint to the filesystem (NFS, AFS, whatever) that
it's going to access every file in those directories.

Certainly AFS could make use of that: the directory is read as a file, and the
netfs then parses the file to get a list of vnode IDs that that directory
points to.  It could then do bulk status fetch operations to instantiate the
inodes 50 at a time.

I don't know whether NFS could use it.  Someone like Trond or SteveD or Chuck
would have to answer that.

David

  parent reply	other threads:[~2008-02-23  1:22 UTC|newest]

Thread overview: 69+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-02-20 16:05 [PATCH 00/37] Permit filesystem local caching David Howells
2008-02-20 16:06 ` [PATCH 01/37] KEYS: Increase the payload size when instantiating a key David Howells
2008-02-20 16:06 ` [PATCH 02/37] KEYS: Check starting keyring as part of search David Howells
2008-02-20 16:06 ` [PATCH 03/37] KEYS: Allow the callout data to be passed as a blob rather than a string David Howells
2008-02-20 16:06 ` [PATCH 04/37] KEYS: Add keyctl function to get a security label David Howells
2008-02-20 16:06 ` [PATCH 05/37] Security: Change current->fs[ug]id to current_fs[ug]id() David Howells
2008-02-20 16:06 ` [PATCH 06/37] Security: Separate task security context from task_struct David Howells
2008-02-22  4:47   ` Casey Schaufler
2008-02-20 16:06 ` [PATCH 07/37] Security: De-embed task security record from task and use refcounting David Howells
2008-02-22  4:57   ` Casey Schaufler
2008-02-20 16:06 ` [PATCH 08/37] Security: Add a kernel_service object class to SELinux David Howells
2008-02-20 16:06 ` [PATCH 09/37] Security: Allow kernel services to override LSM settings for task actions David Howells
2008-02-22  5:06   ` Casey Schaufler
2008-02-22 13:06     ` David Howells
2008-02-20 16:06 ` [PATCH 10/37] Security: Make NFSD work with detached security David Howells
2008-02-20 16:06 ` [PATCH 11/37] FS-Cache: Release page->private after failed readahead David Howells
2008-02-20 16:07 ` [PATCH 12/37] FS-Cache: Recruit a couple of page flags for cache management David Howells
2008-02-20 16:07 ` [PATCH 13/37] FS-Cache: Provide an add_wait_queue_tail() function David Howells
2008-02-20 16:07 ` [PATCH 14/37] FS-Cache: Generic filesystem caching facility David Howells
2008-02-20 16:07 ` [PATCH 15/37] CacheFiles: Add missing copy_page export for ia64 David Howells
2008-02-20 16:07 ` [PATCH 16/37] CacheFiles: Be consistent about the use of mapping vs file->f_mapping in Ext3 David Howells
2008-02-20 16:07 ` [PATCH 17/37] CacheFiles: Add a hook to write a single page of data to an inode David Howells
2008-02-20 16:07 ` [PATCH 18/37] CacheFiles: Permit the page lock state to be monitored David Howells
2008-02-20 16:07 ` [PATCH 19/37] CacheFiles: Export things for CacheFiles David Howells
2008-02-20 16:07 ` [PATCH 20/37] CacheFiles: A cache that backs onto a mounted filesystem David Howells
2008-02-20 16:07 ` [PATCH 21/37] NFS: Add comment banners to some NFS functions David Howells
2008-02-20 16:07 ` [PATCH 22/37] NFS: Add FS-Cache option bit and debug bit David Howells
2008-02-20 16:08 ` [PATCH 23/37] NFS: Permit local filesystem caching to be enabled for NFS David Howells
2008-02-20 16:08 ` [PATCH 24/37] NFS: Register NFS for caching and retrieve the top-level index David Howells
2008-02-20 16:08 ` [PATCH 25/37] NFS: Define and create server-level objects David Howells
2008-02-20 16:08 ` [PATCH 26/37] NFS: Define and create superblock-level objects David Howells
2008-02-20 16:08 ` [PATCH 27/37] NFS: Define and create inode-level cache objects David Howells
2008-02-20 16:08 ` [PATCH 28/37] NFS: Use local disk inode cache David Howells
2008-02-20 16:08 ` [PATCH 29/37] NFS: Invalidate FsCache page flags when cache removed David Howells
2008-02-20 16:08 ` [PATCH 30/37] NFS: Add some new I/O event counters for FS-Cache events David Howells
2008-02-20 16:08 ` [PATCH 31/37] NFS: FS-Cache page management David Howells
2008-02-20 16:08 ` [PATCH 32/37] NFS: Add read context retention for FS-Cache to call back with David Howells
2008-02-20 16:08 ` [PATCH 33/37] NFS: nfs_readpage_async() needs to be accessible as a fallback for local caching David Howells
2008-02-20 16:09 ` [PATCH 34/37] NFS: Read pages from FS-Cache into an NFS inode David Howells
2008-02-20 16:09 ` [PATCH 35/37] NFS: Store pages from an NFS inode into a local cache David Howells
2008-02-20 16:09 ` [PATCH 36/37] NFS: Display local caching state David Howells
2008-02-20 16:09 ` [PATCH 37/37] NFS: Add mount options to enable local caching on NFS David Howells
2008-02-20 19:58 ` [PATCH 00/37] Permit filesystem local caching Serge E. Hallyn
2008-02-20 20:11 ` David Howells
2008-02-21  3:07 ` Daniel Phillips
2008-02-21 12:31 ` David Howells
2008-02-21 14:55 ` David Howells
2008-02-21 15:17   ` Kevin Coffman
2008-02-21 22:44   ` Daniel Phillips
2008-02-21 22:52     ` Muntz, Daniel
2008-02-22  0:07   ` David Howells
2008-02-22  0:57     ` Daniel Phillips
2008-02-22 12:48     ` David Howells
2008-02-22 22:25       ` Daniel Phillips
2008-02-23  1:22       ` David Howells [this message]
2008-02-21 23:33 ` David Howells
2008-02-22 13:52   ` Chris Mason
2008-02-22 16:12   ` David Howells
2008-02-22 16:47   ` David Howells
2008-02-25 23:19   ` David Howells
2008-02-26  0:43     ` Daniel Phillips
2008-02-26  2:00     ` David Howells
2008-02-26 10:26       ` Daniel Phillips
2008-02-26 14:33       ` David Howells
2008-02-26 19:43         ` Daniel Phillips
2008-02-26 21:09         ` David Howells
2008-02-22 16:14 ` David Howells
  -- strict thread matches above, loose matches on Subject: below --
2008-02-22 16:01 Rick Macklem
2008-02-08 16:51 David Howells

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=25049.1203729758@redhat.com \
    --to=dhowells@redhat.com \
    --cc=Trond.Myklebust@netapp.com \
    --cc=casey@schaufler-ca.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=nfsv4@linux-nfs.org \
    --cc=phillips@phunq.net \
    --cc=selinux@tycho.nsa.gov \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).