Re: [PATCH 0/5] VFS: Directory level cache cleaning


From: Dave Chinner <david@fromorbit.com>
To: Li Wang <liwang@ubuntukylin.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
	Sage Weil <sage@inktank.com>,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,
	Yunchuan Wen <yunchuanwen@ubuntukylin.com>
Subject: Re: [PATCH 0/5] VFS: Directory level cache cleaning
Date: Wed, 18 Dec 2013 09:05:03 +1100
Message-ID: <20131217220503.GA20579@dastard>
In-Reply-To: <cover.1387205337.git.liwang@ubuntukylin.com>

On Mon, Dec 16, 2013 at 07:00:04AM -0800, Li Wang wrote:
> Currently, Linux only supports file system wide VFS
> cache (dentry cache and page cache) cleaning through
> '/proc/sys/vm/drop_caches'. Sometimes this is not flexible
> enough. Applications may know exactly whether
> the metadata and data will be referenced in the future, so
> a desirable mechanism is to enable applications to
> reclaim the memory of unused cache entries at a finer
> granularity: the directory level. This enables applications
> to keep hot metadata and data (to be referenced in the
> future) in the cache, and kick unused out to avoid
> cache thrashing. Another advantage is it is more flexible
> for debugging.
>
> This patch extends the 'drop_caches' interface to
> support directory level cache cleaning while remaining fully
> backward compatible. '{1,2,3}' keeps the same semantics
> as before. Besides, "{1,2,3}:DIRECTORY_PATH_NAME" is allowed
> to recursively clean the caches under DIRECTORY_PATH_NAME.
> For example, 'echo 1:/home/foo/jpg > /proc/sys/vm/drop_caches'
> will clean the page caches of the files inside '/home/foo/jpg'.
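
For readers following along, the two accepted string forms described above
('MODE' and 'MODE:DIRECTORY_PATH_NAME') can be sketched in user space. This
is only an illustration of the proposed syntax, not the patch's kernel code;
the function name parse_drop_caches is hypothetical, and rejecting an empty
path after the colon is an assumption the proposal does not state.

```python
def parse_drop_caches(value):
    """Split 'MODE' or 'MODE:PATH' into (mode, path_or_None).

    Hypothetical user-space helper; the real parsing would live in the
    kernel's sysctl handler.
    """
    # echo appends a newline, so trim surrounding whitespace first.
    mode_text, sep, path = value.strip().partition(":")
    if mode_text not in ("1", "2", "3"):
        raise ValueError("mode must be 1, 2 or 3")
    # Assumption: a colon with nothing after it is treated as an error.
    if sep and not path:
        raise ValueError("missing directory path after ':'")
    return int(mode_text), (path if sep else None)


print(parse_drop_caches("3"))
print(parse_drop_caches("1:/home/foo/jpg"))
```

A bare mode yields no path, which corresponds to the existing system-wide
behaviour; a mode with a path corresponds to the proposed recursive,
per-directory cleaning.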
> 
> It is easy to demonstrate the advantage of directory level
> cache cleaning. We use a virtual machine configured with
> an Intel(R) Xeon(R) 8-core CPU E5506 @ 2.13GHz, and with 1GB
> memory.  Three directories named '1', '2' and '3' are created,
> with each containing 180000 – 280000 files. The test program
> opens all files in a directory and then tries the next directory.
> The order for accessing the directories is '1', '2', '3',
> '1'.
> 
> The time to access '1' the second time is measured
> with and without cache cleaning, under different file counts.
> With cache cleaning, we clean all cache entries of files
> in '2' before accessing the files in '3'. The results
> are as follows (in seconds),

This sounds like a highly contrived test case. There is no reason
why dentry cache access time would change going from 180k to 280k
files in 3 directories unless you're right at the memory pressure
balance point in terms of cache sizing.

> (Note: by default, VFS moves unreferenced inodes
> onto a global LRU list rather than freeing them. For this
> experiment, we modified iput() to force the inode to be freed as well.
> This behavior and the related code are left for further discussion
> and are thus not reflected in this patch.)
> 
> Number of files:   180000 200000 220000 240000 260000
> Without cleaning:  2.165  6.977  10.032 11.571 13.443
> With cleaning:     1.949  1.906  2.336  2.918  3.651
>
> When the number of files is 180000 in each directory,
> the metadata cache is large enough to buffer all entries
> of three directories, so re-accessing '1' will hit in
> the cache, regardless of whether '2' is cleaned up or not.
> As the number of files increases, the cache can only
> buffer two-plus directories. Accessing '3' will cause some
> entries of '1' to be evicted (due to LRU). When re-accessing '1',
> some entries need to be reloaded from disk, which is time-consuming.

Ok, so exactly as I thought - your example working set is slightly
larger than what the cache holds. Hence what you are describing is
a cache reclaim threshold effect: something you can avoid with
/proc/sys/vm/vfs_cache_pressure.
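
The threshold effect can be reproduced with a toy model. The sketch below
is a strict LRU cache in user space, not the kernel's dentry/inode cache;
the names LRUCache, drop_directory and misses_on_reaccess, and the capacity
and file counts, are all made up for illustration. It replays the access
order '1', '2', '3', '1' and counts misses on the final pass over '1'.

```python
from collections import OrderedDict


class LRUCache:
    """Toy strict-LRU cache keyed by (directory, file index)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def access(self, key):
        """Touch key; return True on a hit, False on a miss (entry loaded)."""
        if key in self.entries:
            self.entries.move_to_end(key)
            return True
        self.entries[key] = None
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        return False

    def drop_directory(self, directory):
        """Discard every cached entry that belongs to one directory."""
        for key in [k for k in self.entries if k[0] == directory]:
            del self.entries[key]


def misses_on_reaccess(files_per_dir, capacity, drop_dir2):
    """Access '1', '2', '3', then '1' again; count misses on the last pass."""
    cache = LRUCache(capacity)
    for directory in ("1", "2"):
        for i in range(files_per_dir):
            cache.access((directory, i))
    if drop_dir2:
        cache.drop_directory("2")  # models cleaning '2' before touching '3'
    for i in range(files_per_dir):
        cache.access(("3", i))
    return sum(not cache.access(("1", i)) for i in range(files_per_dir))


for files in (100, 110):
    print(files,
          misses_on_reaccess(files, 300, drop_dir2=False),
          misses_on_reaccess(files, 300, drop_dir2=True))
```

With a capacity of 300 entries, 100 files per directory fit entirely and
the second pass over '1' never misses. At 110 files per directory the
working set is only 10% over capacity, yet every re-access of '1' misses
unless '2' was dropped first. A strict LRU makes the cliff sharper than
the measured numbers, but the shape is the same: the benefit appears only
in the narrow band where the working set slightly exceeds the cache.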

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
