From: Dave Chinner <david@fromorbit.com>
To: Li Wang <liwang@ubuntukylin.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
Sage Weil <sage@inktank.com>,
linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org,
Yunchuan Wen <yunchuanwen@ubuntukylin.com>
Subject: Re: [PATCH 0/5] VFS: Directory level cache cleaning
Date: Wed, 18 Dec 2013 09:05:03 +1100
Message-ID: <20131217220503.GA20579@dastard>
In-Reply-To: <cover.1387205337.git.liwang@ubuntukylin.com>

On Mon, Dec 16, 2013 at 07:00:04AM -0800, Li Wang wrote:
> Currently, Linux only support file system wide VFS
> cache (dentry cache and page cache) cleaning through
> '/proc/sys/vm/drop_caches'. Sometimes this is less
> flexible. The applications may know exactly whether
> the metadata and data will be referenced or not in future,
> a desirable mechanism is to enable applications to
> reclaim the memory of unused cache entries at a finer
> granularity - directory level. This enables applications
> to keep hot metadata and data (to be referenced in the
> future) in the cache, and kick unused out to avoid
> cache thrashing. Another advantage is it is more flexible
> for debugging.
>
> This patch extend the 'drop_caches' interface to
> support directory level cache cleaning and has a complete
> backward compatibility. '{1,2,3}' keeps the same semantics
> as before. Besides, "{1,2,3}:DIRECTORY_PATH_NAME" is allowed
> to recursively clean the caches under DIRECTORY_PATH_NAME.
> For example, 'echo 1:/home/foo/jpg > /proc/sys/vm/drop_caches'
> will clean the page caches of the files inside 'home/foo/jpg'.
>
> It is easy to demonstrate the advantage of directory level
> cache cleaning. We use a virtual machine configured with
> an Intel(R) Xeon(R) 8-core CPU E5506 @ 2.13GHz, and with 1GB
> memory. Three directories named '1', '2' and '3' are created,
> with each containing 180000 – 280000 files. The test program
> opens all files in a directory and then tries the next directory.
> The order for accessing the directories is '1', '2', '3',
> '1'.
>
> The time on accessing '1' on the second time is measured
> with/without cache cleaning, under different file counts.
> With cache cleaning, we clean all cache entries of files
> in '2' before accessing the files in '3'. The results
> are as follows (in seconds),

This sounds like a highly contrived test case. There is no reason
why dentry cache access time would change going from 180k to 280k
files in 3 directories unless you're right at the memory pressure
balance point in terms of cache sizing.

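For reference, the access pattern being described can be sketched
roughly as below. The original test program is not included in this
thread, so the function names (touch_files, open_all, run_pattern),
the file naming, and the scaled-down file count are all assumptions
made for illustration. The sketch does not drop any caches and does
not reproduce the modified iput() mentioned in the quoted note.

```python
import os
import tempfile
import time


def touch_files(directory, count):
    """Create `count` empty files in `directory` (hypothetical setup step)."""
    os.makedirs(directory, exist_ok=True)
    for i in range(count):
        with open(os.path.join(directory, f"f{i}"), "w"):
            pass


def open_all(directory):
    """Open and close every entry in `directory`; return how many were opened."""
    opened = 0
    with os.scandir(directory) as entries:
        for entry in entries:
            fd = os.open(entry.path, os.O_RDONLY)
            os.close(fd)
            opened += 1
    return opened


def run_pattern(root, files_per_dir, order=("1", "2", "3", "1")):
    """Visit the directories in `order`; return (seconds, files opened) per pass."""
    for name in sorted(set(order)):
        touch_files(os.path.join(root, name), files_per_dir)
    results = []
    for name in order:
        start = time.perf_counter()
        count = open_all(os.path.join(root, name))
        results.append((time.perf_counter() - start, count))
    return results


if __name__ == "__main__":
    # Assumption: 1000 files per directory keeps the sketch quick to run;
    # the quoted experiment used 180000 to 280000 files per directory.
    with tempfile.TemporaryDirectory() as root:
        for elapsed, count in run_pattern(root, files_per_dir=1000):
            print(f"{count} files opened in {elapsed:.3f}s")
```

The last tuple returned corresponds to the second pass over '1'. At
this scale everything stays cached, so any difference between passes
would only show up once the file count is large enough, relative to
available memory, for reclaim to start evicting entries.
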
> (Note: by default, VFS will move those unreferenced inodes
> into a global LRU list rather than freeing them, for this
> experiment, we modified iput() to force to free inode as well,
> this behavior and related codes are left for further discussion,
> thus not reflected in this patch)
>
> Number of files:    180000   200000   220000   240000   260000
> Without cleaning:    2.165    6.977   10.032   11.571   13.443
> With cleaning:       1.949    1.906    2.336    2.918    3.651
>
> When the number of files is 180000 in each directory,
> the metadata cache is large enough to buffer all entries
> of three directories, so re-accessing '1' will hit in
> the cache, regardless of whether '2' cleaned up or not.
> As the number of files increases, the cache can now only
> buffer two+ directories. Accessing '3' will result in some
> entries of '1' to be evicted (due to LRU). When re-accessing '1',
> some entries need be reloaded from disk, which is time-consuming.

Ok, so exactly as I thought - your example working set is slightly
larger than what the cache holds. Hence what you are describing is
a cache reclaim threshold effect: something you can avoid with
/proc/sys/vm/vfs_cache_pressure.

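As a minimal sketch of what tuning that knob looks like: the sysctl
holds a single integer (100 by default); values below 100 make the
kernel prefer to keep dentries and inodes cached, and values above
100 make it reclaim them more eagerly. The helper names
read_pressure and write_pressure are made up for this example, and
writing to the real file requires root.

```python
import os

# Real sysctl path; the helpers take a `path` argument so they can be
# exercised against an ordinary file without root privileges.
VFS_CACHE_PRESSURE = "/proc/sys/vm/vfs_cache_pressure"


def read_pressure(path=VFS_CACHE_PRESSURE):
    """Return the current setting as an integer."""
    with open(path) as f:
        return int(f.read().strip())


def write_pressure(value, path=VFS_CACHE_PRESSURE):
    """Write a new setting; negative values are rejected before writing."""
    if value < 0:
        raise ValueError("vfs_cache_pressure must be non-negative")
    with open(path, "w") as f:
        f.write(f"{value}\n")


if __name__ == "__main__":
    if os.path.exists(VFS_CACHE_PRESSURE):
        print("current vfs_cache_pressure:", read_pressure())
    # Example only (needs root): bias reclaim towards keeping
    # dentries and inodes in memory.
    # write_pressure(50)
```

Which value suits a given workload is something to measure rather
than assume; 50 above is only a placeholder.
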
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com