From: Eric Dumazet <dada1@cosmosbay.com>
To: David Miller <davem@davemloft.net>, Al Viro <viro@ftp.linux.org.uk>
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC, PATCH] dont insert sockets/pipes dentries into dentry_hashtable.
Date: Wed, 01 Nov 2006 14:54:56 +0100 [thread overview]
Message-ID: <4548A730.8060507@cosmosbay.com> (raw)
In-Reply-To: <20061031.231954.23010447.davem@davemloft.net>
[-- Attachment #1: Type: text/plain, Size: 2052 bytes --]
David Miller a écrit :
> It turns out that while procfs uses a different "mechanism", those
> procfs symlinks do point to the real socket dentry, so when you
> readlink() on it you do d_path() on the real socket dentry.
>
> Al Viro just suggested a way around this to me:
>
> 1) Just mark the dentry HASHED by hand in the dentry flags, but don't
> actually hash it.
>
> 2) Create a special dentry->d_deleted method for sockets that returns
> 0 and clears by hand the HASHED flag bit in the dentry (see what
> dput() does when this happens).
>
> It's an abuse but it will work.
>
Thank you David and Al for the feedback.
Here is a new version of the patch with your suggestions included.
If necessary, I could split this patch in 3 elementary patches. I chose to
sent it as one patch for ease of discussion.
[RFC, PATCH] dont insert sockets/pipes dentries into dentry_hashtable.
We currently insert sockets/pipes dentries into the global dentry hashtable.
This is *useless* because there is currently no way these entries can be used
for a lookup(). (/proc/xxx/fd/xxx uses a different mechanism)
Machines with a lot of sockets/pipes might suffer from longer chains in dentry
hashtable.
Since dentries an unhashed dentry means __dpath() adds a " (deleted)", the
trick for socket/pipe dentries is to :
- Right after d_alloc(), pretend they are hashed by clearing the
DCACHE_UNHASHED bit. __dpath() & friends work as intended.
- Call d_instantiate() instead of d_add() : dentry is not inserted in hash table.
- Once dput() must clear the dentry, setting again DCACHE_UNHASHED bit inside
the custom d_delete() function provided by socket/pipe code.
[patch 1/3] Small optimization to bypass RCU freeing in d_free() for dentries
that were never hashed (like sockets and pipes). Such dentries dont have to
wait a RCU grace period.
[patch 2/3] Change socket code to use d_instantiate() instead of d_add()
[patch 3/3] Change pipe code to use d_instantiate() instead of d_add()
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
[-- Attachment #2: donthash.patch --]
[-- Type: text/plain, Size: 3861 bytes --]
--- linux-2.6.19-rc4/fs/dcache.c 2006-11-01 12:38:23.000000000 +0100
+++ linux-2.6.19-rc4-ed/fs/dcache.c 2006-11-01 13:22:44.000000000 +0100
@@ -68,15 +68,19 @@
.age_limit = 45,
};
-static void d_callback(struct rcu_head *head)
+static void __d_free(struct dentry *dentry)
{
- struct dentry * dentry = container_of(head, struct dentry, d_u.d_rcu);
-
if (dname_external(dentry))
kfree(dentry->d_name.name);
kmem_cache_free(dentry_cache, dentry);
}
+static void d_callback(struct rcu_head *head)
+{
+ struct dentry * dentry = container_of(head, struct dentry, d_u.d_rcu);
+ __d_free(dentry);
+}
+
/*
* no dcache_lock, please. The caller must decrement dentry_stat.nr_dentry
* inside dcache_lock.
@@ -85,7 +89,11 @@
{
if (dentry->d_op && dentry->d_op->d_release)
dentry->d_op->d_release(dentry);
- call_rcu(&dentry->d_u.d_rcu, d_callback);
+ /* if dentry was never inserted into hash, immediate free is OK */
+ if (dentry->d_hash.pprev == NULL)
+ __d_free(dentry);
+ else
+ call_rcu(&dentry->d_u.d_rcu, d_callback);
}
/*
--- linux-2.6.19-rc4/net/socket.c 2006-11-01 12:40:27.000000000 +0100
+++ linux-2.6.19-rc4-ed/net/socket.c 2006-11-01 14:33:18.000000000 +0100
@@ -306,7 +306,14 @@
static int sockfs_delete_dentry(struct dentry *dentry)
{
- return 1;
+ /*
+ * At creation time, we pretended this dentry was hashed
+ * (by clearing DCACHE_UNHASHED bit in d_flags)
+ * At delete time, we restore the truth : not hashed.
+ * (so that dput() can proceed correctly)
+ */
+ dentry->d_flags |= DCACHE_UNHASHED;
+ return 0;
}
static struct dentry_operations sockfs_dentry_operations = {
.d_delete = sockfs_delete_dentry,
@@ -354,14 +361,20 @@
this.len = sprintf(name, "[%lu]", SOCK_INODE(sock)->i_ino);
this.name = name;
- this.hash = SOCK_INODE(sock)->i_ino;
+ this.hash = 0;
file->f_dentry = d_alloc(sock_mnt->mnt_sb->s_root, &this);
if (unlikely(!file->f_dentry))
return -ENOMEM;
file->f_dentry->d_op = &sockfs_dentry_operations;
- d_add(file->f_dentry, SOCK_INODE(sock));
+ /*
+ * We dont want to push this dentry into global dentry hash table.
+ * We pretend dentry is already hashed, by unsetting DCACHE_UNHASHED
+ * This hack permits a working /proc/$pid/fd/XXX on sockets
+ */
+ file->f_dentry->d_flags &= ~DCACHE_UNHASHED;
+ d_instantiate(file->f_dentry, SOCK_INODE(sock));
file->f_vfsmnt = mntget(sock_mnt);
file->f_mapping = file->f_dentry->d_inode->i_mapping;
--- linux-2.6.19-rc4/fs/pipe.c 2006-11-01 12:56:05.000000000 +0100
+++ linux-2.6.19-rc4-ed/fs/pipe.c 2006-11-01 14:33:18.000000000 +0100
@@ -830,7 +830,14 @@
static struct vfsmount *pipe_mnt __read_mostly;
static int pipefs_delete_dentry(struct dentry *dentry)
{
- return 1;
+ /*
+ * At creation time, we pretended this dentry was hashed
+ * (by clearing DCACHE_UNHASHED bit in d_flags)
+ * At delete time, we restore the truth : not hashed.
+ * (so that dput() can proceed correctly)
+ */
+ dentry->d_flags |= DCACHE_UNHASHED;
+ return 0;
}
static struct dentry_operations pipefs_dentry_operations = {
@@ -891,17 +898,22 @@
if (!inode)
goto err_file;
- sprintf(name, "[%lu]", inode->i_ino);
+ this.len = sprintf(name, "[%lu]", inode->i_ino);
this.name = name;
- this.len = strlen(name);
- this.hash = inode->i_ino; /* will go */
+ this.hash = 0;
err = -ENOMEM;
dentry = d_alloc(pipe_mnt->mnt_sb->s_root, &this);
if (!dentry)
goto err_inode;
dentry->d_op = &pipefs_dentry_operations;
- d_add(dentry, inode);
+ /*
+ * We dont want to push this dentry into global dentry hash table.
+ * We pretend dentry is already hashed, by unsetting DCACHE_UNHASHED
+ * This hack permits a working /proc/$pid/fd/XXX on pipes
+ */
+ dentry->d_flags &= ~DCACHE_UNHASHED;
+ d_instantiate(dentry, inode);
f->f_vfsmnt = mntget(pipe_mnt);
f->f_dentry = dentry;
f->f_mapping = inode->i_mapping;
next prev parent reply other threads:[~2006-11-01 13:55 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-10-24 14:17 [PATCH] fix integer overflow in H-TCP congestion control Gavin McCullagh
2006-10-24 22:30 ` David Miller
2006-10-25 8:47 ` Gavin McCullagh
2006-10-26 6:06 ` David Miller
2006-10-31 18:48 ` [RFC, PATCH] dont insert sockets/pipes dentries into dentry_hashtable Eric Dumazet
2006-11-01 7:19 ` David Miller
2006-11-01 8:21 ` Eric Dumazet
2006-11-01 8:34 ` David Miller
2006-11-01 8:38 ` Al Viro
2006-11-01 8:42 ` Al Viro
2006-11-01 9:04 ` Eric Dumazet
2006-11-01 13:54 ` Eric Dumazet [this message]
2006-11-22 18:00 ` [PATCH] [NET] dont insert socket " Eric Dumazet
2006-11-28 23:35 ` David Miller
2006-11-29 0:13 ` Andrew Morton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4548A730.8060507@cosmosbay.com \
--to=dada1@cosmosbay.com \
--cc=davem@davemloft.net \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=viro@ftp.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).