From: Daniel Phillips <phillips@phunq.net>
To: David Howells <dhowells@redhat.com>
Cc: Trond.Myklebust@netapp.com, chuck.lever@oracle.com,
casey@schaufler-ca.com, nfsv4@linux-nfs.org,
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
selinux@tycho.nsa.gov, linux-security-module@vger.kernel.org
Subject: Re: [PATCH 00/37] Permit filesystem local caching
Date: Fri, 22 Feb 2008 14:25:47 -0800
Message-ID: <200802221425.48535.phillips@phunq.net>
In-Reply-To: <20089.1203684531@redhat.com>
On Friday 22 February 2008 04:48, David Howells wrote:
> > But looking up the object in the cache should be nearly free - much less
> > than a microsecond per block.
>
> The problem is that you have to do a database lookup of some sort, possibly
> involving several synchronous disk operations.
Right, so the obvious optimization strategy for this corner of it is to
decimate the synchronous disk ops for the average case, for which there
are a variety of options, one of which you already suggested.
> CacheFiles does a disk lookup by taking the key given to it by NFS, turning it
> into a set of file or directory names, and doing a short pathwalk to the target
> cache file. Throwing in extra indices won't necessarily help. What matters is
> how quick the backing filesystem is at doing lookups. As it turns out, Ext3 is
> a fair bit better than BTRFS when the disk cache is cold.
All understood. I am eventually going to suggest cutting the backing
filesystem entirely out of the picture, with a view to improving both
efficiency and transparency, hopefully with a code size reduction as
well. But you are up and running with the filesystem approach, enough
to tackle the basic algorithm questions, which is worth a lot.
I really do not like the idea of force-fitting this cache into a generic
vfs model. Sun was collectively smoking some serious crack when they
cooked that one up. But there is also the ageless principle "isness is
more important than niceness".
> > > The metadata problem is quite a tricky one since it increases with the
> > > number of files you're dealing with. As things stand in my patches, when
> > > NFS, for example, wants to access a new inode, it first has to go to the
> > > server to lookup the NFS file handle, and only then can it go to the cache
> > > to find out if there's a matching object in the cache.
> >
> > So without the persistent cache it can omit the LOOKUP and just send the
> > filehandle as part of the READ?
>
> What 'it'? Note that to get the filehandle, you have to do a LOOKUP op. With
> the cache, we could actually cache the results of lookups that we've done,
> however, we don't know that the results are still valid without going to the
> server:-/
What I was trying to say. It => the cache logic.
> AFS has a way around that - it versions its vnode (inode) IDs.
Which would require a change to NFS, not an option because you hope to
work with standard servers? Of course with years to think about this,
the required protocol changes were put into v4. Not.
/me hopes for an NFS hack to show up and explain the thinking there
Actually, there are many situations where changing both the client (you
must do that anyway) and the server is logistically practical. In fact
that is true for all actual use cases I know of for this cache model.
So elaborating the protocol is not an option to reject out of hand. A
hack along those lines could (should?) be provided as an opportunistic
option.
Have you completely exhausted optimization ideas for the file handle
lookup?
> > > The reason my client going to my server is so quick is that the server has
> > > the dcache and the pagecache preloaded, so that across-network lookup
> > > operations are really, really quick, as compared to the synchronous
> > > slogging of the local disk to find the cache object.
> >
> > Doesn't that just mean you have to preload the lookup table for the
> > persistent cache so you can determine whether you are caching the data
> > for a filehandle without going to disk?
>
> Where "lookup table" == "dcache". That would be good yes. cachefilesd
> prescans all the files in the cache, which ought to do just that, but it
> doesn't seem to be very effective. I'm not sure why.
RCU? Anyway, it is something to be tracked down and put right.
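
The preloaded lookup table discussed in this exchange can be sketched roughly as follows: walk the cache once, keep the object names in memory, and answer "is it cached?" without touching the disk again. The names preload_index and is_cached are hypothetical, and a user-space set stands in for whatever the kernel would really use (the dcache, per the quoted reply).

```python
import os


def preload_index(cache_root):
    """Walk the cache directory once and remember every object name."""
    index = set()
    for dirpath, _dirnames, filenames in os.walk(cache_root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            index.add(os.path.relpath(full, cache_root))
    return index


def is_cached(index, relname):
    """Pure in-memory membership test; no disk access on this path."""
    return relname in index
```

The one-time scan pays the synchronous disk cost up front, so later presence checks are in-memory only. Keeping such an index coherent with the on-disk cache is the hard part and is not addressed in this sketch.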
> > Your big can't-get-there-from-here is the round trip to the server to
> > determine whether you should read from the local cache. Got any ideas?
>
> I'm not sure what you mean. Your statement should probably read "... to
> determine _what_ you should read from the local cache".
What I tried to say. So still... got any ideas? That extra synchronous
network round trip is a killer. Can it be made streaming/async to keep
throughput healthy?
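
As a rough illustration of the streaming/async question, the sketch below issues all validation round trips concurrently instead of waiting for each one in turn, so N handles cost about one round-trip time rather than N. The validate coroutine is a hypothetical stand-in that only sleeps; it is not a real NFS call, and the 10 ms round-trip time is an assumption.

```python
import asyncio


async def validate(handle, rtt=0.01):
    """Hypothetical stand-in for one validation round trip to the server."""
    await asyncio.sleep(rtt)  # simulated network latency
    return handle, True


async def validate_all(handles, rtt=0.01):
    """Keep every request in flight at once; results come back in order."""
    return await asyncio.gather(*(validate(h, rtt) for h in handles))


def run(handles, rtt=0.01):
    return asyncio.run(validate_all(handles, rtt))
```

This helps throughput only. The first byte for any one file still waits on a full round trip.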
> > And where is the Trond-meister in all of this?
>
> Keeping quiet as far as I can tell.
/me does the Trond summoning dance
Daniel