From: Vasily Averin <vvs@sw.ru>
To: Neil Brown <neilb@suse.de>
Cc: David Howells <dhowells@redhat.com>, Kirill Korotaev <dev@sw.ru>,
devel@openvz.org, Andrew Morton <akpm@osdl.org>,
Kirill Korotaev <dev@openvz.org>,
Balbir Singh <balbir@in.ibm.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Olaf Hering <olh@suse.de>, Jan Blunck <jblunck@suse.de>
Subject: Re: [PATCH 2.6.19-rc3] VFS: per-sb dentry lru list
Date: Tue, 14 Nov 2006 09:12:45 +0300 [thread overview]
Message-ID: <45595E5D.5040309@sw.ru> (raw)
In-Reply-To: <17753.22478.650633.368773@cse.unsw.edu.au>
[-- Attachment #1: Type: text/plain, Size: 2380 bytes --]
Hello Neil
Neil Brown wrote:
> On Wednesday November 1, vvs@sw.ru wrote:
>> Currently we have 3 type of functions that works with dentry_unused list:
>>
>> 1) prune_dcache(NULL) -- called from shrink_dcache_memory, frees the memory and
>> requires global LRU. works well in current implementation.
>> 2) prune_dcache(sb) -- called from shrink_dcache_parent(), frees subtree, LRU
>> is not need here. Current implementation uses global LRU for these purposes, it
>> is ineffective, and patch from Neil Brown fixes this issue.
>> 3) shrink_dcache_sb() -- called when we need to free the unused dentries for
>> given super block. Current implementation is not effective too, and per-sb LRU
>> would be the best solution here. On the other hand patch from Neil Brown is much
>> better than current implementation.
>>
>> In general I think that we should approve Neil Brown's patch. We (I and Kirill
>> Korotaev) are ready to acknowledge it when the following remarks fill be fixed:
>
>
>> - it seems for me list_splice() is not required inside
>> prune_dcache(),
>
> Yes, the list should be empty when we finish so you are right.
>
>> - DCACHE_REFERENCED dentries should not be removed from private list to
>> dentry_unused list, this flag should be ignored if the private list is used,
>
> Agreed.
>
>> - count argument should be ignored in this case too, we want to free all the
>> dentries in private list,
>
> Agreed.
>
>> - when we shrink the whole super block we should free per-sb anonymous dentries
>> too (please see Kirill Korotaev's letter)
>>
>
> Yes. Unfortunately I don't think it is as easy as it sounds.
> I'll have a closer look.
>
>
>> Then I'm going to prepare new patch that will enhance the shrink_dcache_sb()
>> performance:
>> - we can add new list head into struct superblock and use it in
>> shrink_dcache_sb() instead of temporal private list. We will check is it empty
>> in dput() and add the new unused dentries to per-sb list instead of
>> dentry_unused list.
>
> I think that makes sense. It means that you end up doing less work in
> select_parent, because the work has already been done in dput.
>
> How is the patch going?
Please see patches in attach files:
first one is incremental for your patch,
second one is merged version.
It would be very interesting to know your opinion about these changes.
thank you,
Vasily Averin
[-- Attachment #2: diff-vfs-shrink-dcache-NBaddon-20061103 --]
[-- Type: text/plain, Size: 6385 bytes --]
--- linux-2.6.19-rc4/fs/dcache.c.vshud1 2006-11-02 14:23:03.000000000 +0300
+++ linux-2.6.19-rc4/fs/dcache.c 2006-11-03 14:57:58.000000000 +0300
@@ -171,8 +171,14 @@ repeat:
if (d_unhashed(dentry))
goto kill_it;
if (list_empty(&dentry->d_lru)) {
- dentry->d_flags |= DCACHE_REFERENCED;
- list_add(&dentry->d_lru, &dentry_unused);
+ struct list_head *list;
+
+ list = &dentry->d_sb->s_shrink_list;
+ if (list_empty(list)) {
+ list = &dentry_unused;
+ dentry->d_flags |= DCACHE_REFERENCED;
+ }
+ list_add(&dentry->d_lru, list);
dentry_stat.nr_unused++;
}
spin_unlock(&dentry->d_lock);
@@ -394,10 +400,6 @@ static void prune_one_dentry(struct dent
*
* This function may fail to free any resources if
* all the dentries are in use.
- *
- * Any dentries that were not removed due to the @count
- * limit will be splice on to the end of dentry_unused,
- * so they should already be founded in dentry_stat.nr_unused.
*/
static void prune_dcache(int count, struct list_head *list)
@@ -409,7 +411,7 @@ static void prune_dcache(int count, stru
/* use the dentry_unused list */
list = &dentry_unused;
- for (; count ; count--) {
+ for (; have_list || count-- ;) {
struct dentry *dentry;
struct list_head *tmp;
struct rw_semaphore *s_umount;
@@ -434,6 +436,17 @@ static void prune_dcache(int count, stru
spin_unlock(&dentry->d_lock);
continue;
}
+ /*
+ * If this dentry is for "my" filesystem, then I can prune it
+ * without taking the s_umount lock: either I already hold it
+ * (called from shrink_dcache_sb) or are called from some
+ * filesystem opernations and therefore cannot race with
+ * generic_shutdown_super().
+ */
+ if (have_list) {
+ prune_one_dentry(dentry);
+ continue;
+ }
/* If the dentry was recently referenced, don't free it. */
if (dentry->d_flags & DCACHE_REFERENCED) {
dentry->d_flags &= ~DCACHE_REFERENCED;
@@ -443,21 +456,6 @@ static void prune_dcache(int count, stru
continue;
}
/*
- * If the dentry is not DCACHED_REFERENCED, it is time
- * to remove it from the dcache, provided the super block is
- * NULL (which means we are trying to reclaim memory)
- * or this dentry belongs to the same super block that
- * we want to shrink.
- */
- /*
- * If this dentry is for "my" filesystem, then I can prune it
- * without taking the s_umount lock (I already hold it).
- */
- if (have_list) {
- prune_one_dentry(dentry);
- continue;
- }
- /*
* ...otherwise we need to be sure this filesystem isn't being
* unmounted, otherwise we could race with
* generic_shutdown_super(), and end up holding a reference to
@@ -483,27 +481,9 @@ static void prune_dcache(int count, stru
list_add(&dentry->d_lru, &dentry_unused);
dentry_stat.nr_unused++;
}
- /* split any remaining entries back onto dentry_unused */
- if (have_list)
- list_splice(list, dentry_unused.prev);
spin_unlock(&dcache_lock);
}
-/**
- * shrink_dcache_sb - shrink dcache for a superblock
- * @sb: superblock
- *
- * Shrink the dcache for the specified super block. This
- * is used to reduce the dcache presence of a file system
- * before re-mounting, and when invalidating the device
- * holding a file system.
- */
-
-void shrink_dcache_sb(struct super_block * sb)
-{
- shrink_dcache_parent(sb->s_root);
-}
-
/*
* destroy a single subtree of dentries for unmount
* - see the comments on shrink_dcache_for_umount() for a description of the
@@ -706,11 +686,10 @@ positive:
* drop the lock and return early due to latency
* constraints.
*/
-static int select_parent(struct dentry * parent, struct list_head *new)
+static void select_parent(struct dentry * parent, struct list_head *new)
{
struct dentry *this_parent = parent;
struct list_head *next;
- int found = 0;
spin_lock(&dcache_lock);
repeat:
@@ -732,7 +711,6 @@ resume:
if (!atomic_read(&dentry->d_count)) {
list_add_tail(&dentry->d_lru, new);
dentry_stat.nr_unused++;
- found++;
}
/*
@@ -740,7 +718,7 @@ resume:
* ensures forward progress). We'll be coming back to find
* the rest.
*/
- if (found && need_resched())
+ if (!list_empty(new) && need_resched())
goto out;
/*
@@ -761,7 +739,37 @@ resume:
}
out:
spin_unlock(&dcache_lock);
- return found;
+}
+
+/**
+ * select_anon - further prune the cache
+ * @sb: superblock
+ *
+ * Prune the dentries that are anonymous
+ */
+
+static void select_anon(struct super_block *sb)
+{
+ struct hlist_node *lp;
+ struct hlist_head *head = &sb->s_anon;
+
+ spin_lock(&dcache_lock);
+ hlist_for_each(lp, head) {
+ struct dentry *this = hlist_entry(lp, struct dentry, d_hash);
+ if (!list_empty(&this->d_lru)) {
+ dentry_stat.nr_unused--;
+ list_del_init(&this->d_lru);
+ }
+ /*
+ * move only zero ref count dentries to the end
+ * of list for prune_dcache
+ */
+ if (!atomic_read(&this->d_count)) {
+ list_add(&this->d_lru, &sb->s_shrink_list);
+ dentry_stat.nr_unused++;
+ }
+ }
+ spin_unlock(&dcache_lock);
}
/**
@@ -773,11 +781,38 @@ out:
void shrink_dcache_parent(struct dentry * parent)
{
- int found;
LIST_HEAD(list);
- while ((found = select_parent(parent, &list)) != 0)
- prune_dcache(found, &list);
+ for (;;) {
+ select_parent(parent, &list);
+ if (list_empty(&list))
+ break;
+ prune_dcache(0, &list);
+ }
+}
+
+/**
+ * shrink_dcache_sb - shrink dcache for a superblock
+ * @sb: superblock
+ *
+ * Shrink the dcache for the specified super block. This
+ * is used to reduce the dcache presence of a file system
+ * before re-mounting, and when invalidating the device
+ * holding a file system.
+ */
+
+void shrink_dcache_sb(struct super_block * sb)
+{
+ struct list_head *list;
+
+ list = &sb->s_shrink_list;
+ for (;;) {
+ select_parent(sb->s_root, list);
+ select_anon(sb);
+ if (list_empty(list))
+ break;
+ prune_dcache(0, list);
+ }
}
/*
--- linux-2.6.19-rc4/include/linux/fs.h.vshud1 2006-11-03 15:28:04.000000000 +0300
+++ linux-2.6.19-rc4/include/linux/fs.h 2006-11-03 14:52:16.000000000 +0300
@@ -941,6 +941,7 @@ struct super_block {
struct hlist_head s_anon; /* anonymous dentries for (nfs) exporting */
struct list_head s_files;
+ struct list_head s_shrink_list;
struct block_device *s_bdev;
struct list_head s_instances;
struct quota_info s_dquot; /* Diskquota specific options */
[-- Attachment #3: diff-vfs-shrink-dcache-20061103 --]
[-- Type: text/plain, Size: 8824 bytes --]
--- linux-2.6.19-rc4/fs/dcache.c.vdch 2006-11-03 14:22:54.000000000 +0300
+++ linux-2.6.19-rc4/fs/dcache.c 2006-11-03 14:57:58.000000000 +0300
@@ -171,8 +171,14 @@ repeat:
if (d_unhashed(dentry))
goto kill_it;
if (list_empty(&dentry->d_lru)) {
- dentry->d_flags |= DCACHE_REFERENCED;
- list_add(&dentry->d_lru, &dentry_unused);
+ struct list_head *list;
+
+ list = &dentry->d_sb->s_shrink_list;
+ if (list_empty(list)) {
+ list = &dentry_unused;
+ dentry->d_flags |= DCACHE_REFERENCED;
+ }
+ list_add(&dentry->d_lru, list);
dentry_stat.nr_unused++;
}
spin_unlock(&dentry->d_lock);
@@ -384,8 +390,8 @@ static void prune_one_dentry(struct dent
/**
* prune_dcache - shrink the dcache
* @count: number of entries to try and free
- * @sb: if given, ignore dentries for other superblocks
- * which are being unmounted.
+ * @list: If given, remove from this list instead of
+ * from dentry_unused.
*
* Shrink the dcache. This is done when we need
* more memory, or simply when we need to unmount
@@ -396,33 +402,27 @@ static void prune_one_dentry(struct dent
* all the dentries are in use.
*/
-static void prune_dcache(int count, struct super_block *sb)
+static void prune_dcache(int count, struct list_head *list)
{
+ int have_list = list != NULL;
+
spin_lock(&dcache_lock);
- for (; count ; count--) {
+ if (!have_list)
+ /* use the dentry_unused list */
+ list = &dentry_unused;
+
+ for (; have_list || count-- ;) {
struct dentry *dentry;
struct list_head *tmp;
struct rw_semaphore *s_umount;
cond_resched_lock(&dcache_lock);
- tmp = dentry_unused.prev;
- if (sb) {
- /* Try to find a dentry for this sb, but don't try
- * too hard, if they aren't near the tail they will
- * be moved down again soon
- */
- int skip = count;
- while (skip && tmp != &dentry_unused &&
- list_entry(tmp, struct dentry, d_lru)->d_sb != sb) {
- skip--;
- tmp = tmp->prev;
- }
- }
- if (tmp == &dentry_unused)
+ tmp = list->prev;
+ if (tmp == list)
break;
list_del_init(tmp);
- prefetch(dentry_unused.prev);
+ prefetch(list->prev);
dentry_stat.nr_unused--;
dentry = list_entry(tmp, struct dentry, d_lru);
@@ -436,6 +436,17 @@ static void prune_dcache(int count, stru
spin_unlock(&dentry->d_lock);
continue;
}
+ /*
+ * If this dentry is for "my" filesystem, then I can prune it
+ * without taking the s_umount lock: either I already hold it
+ * (called from shrink_dcache_sb) or are called from some
+ * filesystem opernations and therefore cannot race with
+ * generic_shutdown_super().
+ */
+ if (have_list) {
+ prune_one_dentry(dentry);
+ continue;
+ }
/* If the dentry was recently referenced, don't free it. */
if (dentry->d_flags & DCACHE_REFERENCED) {
dentry->d_flags &= ~DCACHE_REFERENCED;
@@ -445,21 +456,6 @@ static void prune_dcache(int count, stru
continue;
}
/*
- * If the dentry is not DCACHED_REFERENCED, it is time
- * to remove it from the dcache, provided the super block is
- * NULL (which means we are trying to reclaim memory)
- * or this dentry belongs to the same super block that
- * we want to shrink.
- */
- /*
- * If this dentry is for "my" filesystem, then I can prune it
- * without taking the s_umount lock (I already hold it).
- */
- if (sb && dentry->d_sb == sb) {
- prune_one_dentry(dentry);
- continue;
- }
- /*
* ...otherwise we need to be sure this filesystem isn't being
* unmounted, otherwise we could race with
* generic_shutdown_super(), and end up holding a reference to
@@ -489,67 +485,6 @@ static void prune_dcache(int count, stru
}
/*
- * Shrink the dcache for the specified super block.
- * This allows us to unmount a device without disturbing
- * the dcache for the other devices.
- *
- * This implementation makes just two traversals of the
- * unused list. On the first pass we move the selected
- * dentries to the most recent end, and on the second
- * pass we free them. The second pass must restart after
- * each dput(), but since the target dentries are all at
- * the end, it's really just a single traversal.
- */
-
-/**
- * shrink_dcache_sb - shrink dcache for a superblock
- * @sb: superblock
- *
- * Shrink the dcache for the specified super block. This
- * is used to free the dcache before unmounting a file
- * system
- */
-
-void shrink_dcache_sb(struct super_block * sb)
-{
- struct list_head *tmp, *next;
- struct dentry *dentry;
-
- /*
- * Pass one ... move the dentries for the specified
- * superblock to the most recent end of the unused list.
- */
- spin_lock(&dcache_lock);
- list_for_each_safe(tmp, next, &dentry_unused) {
- dentry = list_entry(tmp, struct dentry, d_lru);
- if (dentry->d_sb != sb)
- continue;
- list_move(tmp, &dentry_unused);
- }
-
- /*
- * Pass two ... free the dentries for this superblock.
- */
-repeat:
- list_for_each_safe(tmp, next, &dentry_unused) {
- dentry = list_entry(tmp, struct dentry, d_lru);
- if (dentry->d_sb != sb)
- continue;
- dentry_stat.nr_unused--;
- list_del_init(tmp);
- spin_lock(&dentry->d_lock);
- if (atomic_read(&dentry->d_count)) {
- spin_unlock(&dentry->d_lock);
- continue;
- }
- prune_one_dentry(dentry);
- cond_resched_lock(&dcache_lock);
- goto repeat;
- }
- spin_unlock(&dcache_lock);
-}
-
-/*
* destroy a single subtree of dentries for unmount
* - see the comments on shrink_dcache_for_umount() for a description of the
* locking
@@ -739,7 +674,7 @@ positive:
/*
* Search the dentry child list for the specified parent,
- * and move any unused dentries to the end of the unused
+ * and move any unused dentries to the end of a new unused
* list for prune_dcache(). We descend to the next level
* whenever the d_subdirs list is non-empty and continue
* searching.
@@ -751,11 +686,10 @@ positive:
* drop the lock and return early due to latency
* constraints.
*/
-static int select_parent(struct dentry * parent)
+static void select_parent(struct dentry * parent, struct list_head *new)
{
struct dentry *this_parent = parent;
struct list_head *next;
- int found = 0;
spin_lock(&dcache_lock);
repeat:
@@ -775,9 +709,8 @@ resume:
* of the unused list for prune_dcache
*/
if (!atomic_read(&dentry->d_count)) {
- list_add_tail(&dentry->d_lru, &dentry_unused);
+ list_add_tail(&dentry->d_lru, new);
dentry_stat.nr_unused++;
- found++;
}
/*
@@ -785,7 +718,7 @@ resume:
* ensures forward progress). We'll be coming back to find
* the rest.
*/
- if (found && need_resched())
+ if (!list_empty(new) && need_resched())
goto out;
/*
@@ -806,7 +739,37 @@ resume:
}
out:
spin_unlock(&dcache_lock);
- return found;
+}
+
+/**
+ * select_anon - further prune the cache
+ * @sb: superblock
+ *
+ * Prune the dentries that are anonymous
+ */
+
+static void select_anon(struct super_block *sb)
+{
+ struct hlist_node *lp;
+ struct hlist_head *head = &sb->s_anon;
+
+ spin_lock(&dcache_lock);
+ hlist_for_each(lp, head) {
+ struct dentry *this = hlist_entry(lp, struct dentry, d_hash);
+ if (!list_empty(&this->d_lru)) {
+ dentry_stat.nr_unused--;
+ list_del_init(&this->d_lru);
+ }
+ /*
+ * move only zero ref count dentries to the end
+ * of list for prune_dcache
+ */
+ if (!atomic_read(&this->d_count)) {
+ list_add(&this->d_lru, &sb->s_shrink_list);
+ dentry_stat.nr_unused++;
+ }
+ }
+ spin_unlock(&dcache_lock);
}
/**
@@ -818,10 +781,38 @@ out:
void shrink_dcache_parent(struct dentry * parent)
{
- int found;
+ LIST_HEAD(list);
- while ((found = select_parent(parent)) != 0)
- prune_dcache(found, parent->d_sb);
+ for (;;) {
+ select_parent(parent, &list);
+ if (list_empty(&list))
+ break;
+ prune_dcache(0, &list);
+ }
+}
+
+/**
+ * shrink_dcache_sb - shrink dcache for a superblock
+ * @sb: superblock
+ *
+ * Shrink the dcache for the specified super block. This
+ * is used to reduce the dcache presence of a file system
+ * before re-mounting, and when invalidating the device
+ * holding a file system.
+ */
+
+void shrink_dcache_sb(struct super_block * sb)
+{
+ struct list_head *list;
+
+ list = &sb->s_shrink_list;
+ for (;;) {
+ select_parent(sb->s_root, list);
+ select_anon(sb);
+ if (list_empty(list))
+ break;
+ prune_dcache(0, list);
+ }
}
/*
--- linux-2.6.19-rc4/include/linux/fs.h.vdch 2006-11-03 14:22:12.000000000 +0300
+++ linux-2.6.19-rc4/include/linux/fs.h 2006-11-03 14:52:16.000000000 +0300
@@ -941,6 +941,7 @@ struct super_block {
struct hlist_head s_anon; /* anonymous dentries for (nfs) exporting */
struct list_head s_files;
+ struct list_head s_shrink_list;
struct block_device *s_bdev;
struct list_head s_instances;
struct quota_info s_dquot; /* Diskquota specific options */
next prev parent reply other threads:[~2006-11-14 6:12 UTC|newest]
Thread overview: 44+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-10-25 12:30 [Q] missing unused dentry in prune_dcache()? Vasily Averin
2006-10-25 13:51 ` David Howells
2006-10-25 13:58 ` Vasily Averin
2006-10-25 14:23 ` David Howells
2006-10-26 11:36 ` Vasily Averin
2006-10-26 13:25 ` David Howells
2006-10-27 8:05 ` Vasily Averin
2006-10-27 10:42 ` David Howells
2006-10-27 11:50 ` Vasily Averin
2006-10-27 12:11 ` David Howells
2006-10-27 13:47 ` Vasily Averin
2006-10-27 14:29 ` David Howells
2006-10-27 14:39 ` Kirill Korotaev
2006-10-27 14:05 ` [PATCH 2.6.19-rc3] VFS: per-sb dentry lru list Vasily Averin
2006-10-27 18:06 ` Andrew Morton
2006-10-30 14:24 ` Vasily Averin
2006-10-30 15:08 ` Eric Dumazet
2006-10-30 15:34 ` Kirill Korotaev
2006-10-30 16:09 ` Eric Dumazet
2006-10-30 15:14 ` Kirill Korotaev
2006-10-30 4:24 ` David Chinner
2006-10-30 6:28 ` [Devel] " Kirill Korotaev
2006-10-31 4:38 ` Neil Brown
2006-10-31 10:40 ` David Howells
2006-11-01 6:32 ` Neil Brown
2006-11-01 13:32 ` Vasily Averin
2006-11-14 5:44 ` Neil Brown
2006-11-14 6:12 ` Vasily Averin [this message]
2006-10-31 13:08 ` Vasily Averin
2006-11-01 10:55 ` [Devel] " Kirill Korotaev
2006-11-14 4:51 ` Neil Brown
2006-11-14 9:29 ` David Howells
2006-10-27 13:42 ` [PATCH 2.6.19-rc3] VFS: missing unused dentry in prune_dcache() Vasily Averin
2006-10-27 14:24 ` David Howells
2006-10-26 11:49 ` [Q] missing unused dentry in prune_dcache()? Vasily Averin
2006-10-26 12:33 ` David Howells
2006-10-26 13:23 ` David Howells
2006-10-26 13:58 ` Vasily Averin
2006-10-26 14:07 ` David Howells
2006-10-27 6:32 ` Vasily Averin
2006-10-27 6:50 ` Vasily Averin
2006-10-27 9:36 ` David Howells
2006-10-31 13:24 ` [Q] missing ->d_delete() in shrink_dcache_for_umount()? Vasily Averin
2006-10-31 15:06 ` David Howells
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=45595E5D.5040309@sw.ru \
--to=vvs@sw.ru \
--cc=akpm@osdl.org \
--cc=balbir@in.ibm.com \
--cc=dev@openvz.org \
--cc=dev@sw.ru \
--cc=devel@openvz.org \
--cc=dhowells@redhat.com \
--cc=jblunck@suse.de \
--cc=linux-kernel@vger.kernel.org \
--cc=neilb@suse.de \
--cc=olh@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox