* [GIT PATCH] sysfs fixes for 2.6.22-rc4
@ 2007-06-12 23:21 Greg KH
2007-06-12 23:22 ` [PATCH 1/3] sysfs: store sysfs inode nrs in s_ino to avoid readdir oopses Greg Kroah-Hartman
2007-06-13 3:24 ` [GIT PATCH] sysfs fixes for 2.6.22-rc4 Eric Sandeen
0 siblings, 2 replies; 5+ messages in thread
From: Greg KH @ 2007-06-12 23:21 UTC (permalink / raw)
To: Linus Torvalds, Andrew Morton; +Cc: linux-kernel, htejun, sandeen
Here are some sysfs fixes for 2.6.22-rc4
They are based on a set of patches from Tejun that have been in the -mm
tree for a while and fix a nasty sysfs problem that people have been
hitting in the real world (Google has hit this a lot and spent a lot of
time and effort in tracking this down, I'd really like to say thanks for
the help here.)
Tejun and Eric have backported a small set of patches that fix this for
your current tree, with the larger, more intrusive patches queued up for
after 2.6.22 is out.
Tejun and I have beat on these patches a lot and have not found any
problems. I know it's late in the series for them, but under the
circumstances, it seems reasonable.
Please pull from:
master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6.git/
Patches will be sent as a follow-on to this message to lkml for people
to see.
thanks,
greg k-h
fs/sysfs/dir.c | 38 +++++++++++++++++++++++++++++++-------
fs/sysfs/inode.c | 21 +++++++++++++++++++--
fs/sysfs/mount.c | 1 +
fs/sysfs/sysfs.h | 2 ++
4 files changed, 53 insertions(+), 9 deletions(-)
---------------
Eric Sandeen (1):
sysfs: store sysfs inode nrs in s_ino to avoid readdir oopses
Tejun Heo (2):
sysfs: fix condition check in sysfs_drop_dentry()
sysfs: fix race condition around sd->s_dentry, take#2
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 1/3] sysfs: store sysfs inode nrs in s_ino to avoid readdir oopses
2007-06-12 23:21 [GIT PATCH] sysfs fixes for 2.6.22-rc4 Greg KH
@ 2007-06-12 23:22 ` Greg Kroah-Hartman
2007-06-12 23:22 ` [PATCH 2/3] sysfs: fix condition check in sysfs_drop_dentry() Greg Kroah-Hartman
2007-06-13 3:24 ` [GIT PATCH] sysfs fixes for 2.6.22-rc4 Eric Sandeen
1 sibling, 1 reply; 5+ messages in thread
From: Greg Kroah-Hartman @ 2007-06-12 23:22 UTC (permalink / raw)
To: linux-kernel; +Cc: Eric Sandeen, Eric Sandeen, Tejun Heo, Greg Kroah-Hartman
From: Eric Sandeen <sandeen@sandeen.net>
Backport of
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.22-rc1/2.6.22-rc1-mm1/broken-out/gregkh-driver-sysfs-allocate-inode-number-using-ida.patch
For regular files in sysfs, sysfs_readdir wants to traverse
sysfs_dirent->s_dentry->d_inode->i_ino to get to the inode number.
But, the dentry can be reclaimed under memory pressure, and there is
no synchronization with readdir. This patch follows Tejun's scheme of
allocating and storing an inode number in the new s_ino member of a
sysfs_dirent, when dirents are created, and retrieving it from there
for readdir, so that the pointer chain doesn't have to be traversed.
Tejun's upstream patch uses a new-ish "ida" allocator which brings
along some extra complexity; this -stable patch has a brain-dead
incrementing counter which does not guarantee uniqueness, but because
sysfs doesn't hash inodes as iunique expects, uniqueness wasn't
guaranteed today anyway.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
fs/sysfs/dir.c | 16 +++++++++++-----
fs/sysfs/inode.c | 1 +
fs/sysfs/mount.c | 1 +
fs/sysfs/sysfs.h | 1 +
4 files changed, 14 insertions(+), 5 deletions(-)
diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 85a6686..17a8191 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -30,6 +30,14 @@ static struct dentry_operations sysfs_dentry_ops = {
.d_iput = sysfs_d_iput,
};
+static unsigned int sysfs_inode_counter;
+ino_t sysfs_get_inum(void)
+{
+ if (unlikely(sysfs_inode_counter < 3))
+ sysfs_inode_counter = 3;
+ return sysfs_inode_counter++;
+}
+
/*
* Allocates a new sysfs_dirent and links it to the parent sysfs_dirent
*/
@@ -41,6 +49,7 @@ static struct sysfs_dirent * __sysfs_new_dirent(void * element)
if (!sd)
return NULL;
+ sd->s_ino = sysfs_get_inum();
atomic_set(&sd->s_count, 1);
atomic_set(&sd->s_event, 1);
INIT_LIST_HEAD(&sd->s_children);
@@ -509,7 +518,7 @@ static int sysfs_readdir(struct file * filp, void * dirent, filldir_t filldir)
switch (i) {
case 0:
- ino = dentry->d_inode->i_ino;
+ ino = parent_sd->s_ino;
if (filldir(dirent, ".", 1, i, ino, DT_DIR) < 0)
break;
filp->f_pos++;
@@ -538,10 +547,7 @@ static int sysfs_readdir(struct file * filp, void * dirent, filldir_t filldir)
name = sysfs_get_name(next);
len = strlen(name);
- if (next->s_dentry)
- ino = next->s_dentry->d_inode->i_ino;
- else
- ino = iunique(sysfs_sb, 2);
+ ino = next->s_ino;
if (filldir(dirent, name, len, filp->f_pos, ino,
dt_type(next)) < 0)
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index bdd30e7..082e2d4 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -141,6 +141,7 @@ struct inode * sysfs_new_inode(mode_t mode, struct sysfs_dirent * sd)
inode->i_mapping->a_ops = &sysfs_aops;
inode->i_mapping->backing_dev_info = &sysfs_backing_dev_info;
inode->i_op = &sysfs_inode_operations;
+ inode->i_ino = sd->s_ino;
lockdep_set_class(&inode->i_mutex, &sysfs_inode_imutex_key);
if (sd->s_iattr) {
diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
index 23a48a3..00ab912 100644
--- a/fs/sysfs/mount.c
+++ b/fs/sysfs/mount.c
@@ -33,6 +33,7 @@ static struct sysfs_dirent sysfs_root = {
.s_element = NULL,
.s_type = SYSFS_ROOT,
.s_iattr = NULL,
+ .s_ino = 1,
};
static void sysfs_clear_inode(struct inode *inode)
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index a77c57e..1966e1a 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -5,6 +5,7 @@ struct sysfs_dirent {
void * s_element;
int s_type;
umode_t s_mode;
+ ino_t s_ino;
struct dentry * s_dentry;
struct iattr * s_iattr;
atomic_t s_event;
--
1.5.2.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH 2/3] sysfs: fix condition check in sysfs_drop_dentry()
2007-06-12 23:22 ` [PATCH 1/3] sysfs: store sysfs inode nrs in s_ino to avoid readdir oopses Greg Kroah-Hartman
@ 2007-06-12 23:22 ` Greg Kroah-Hartman
2007-06-12 23:22 ` [PATCH 3/3] sysfs: fix race condition around sd->s_dentry, take#2 Greg Kroah-Hartman
0 siblings, 1 reply; 5+ messages in thread
From: Greg Kroah-Hartman @ 2007-06-12 23:22 UTC (permalink / raw)
To: linux-kernel; +Cc: Tejun Heo, Greg Kroah-Hartman
From: Tejun Heo <htejun@gmail.com>
The condition check doesn't make much sense as it basically always
succeeds. This causes NULL dereferencing on certain cases. It seems
that parentheses are put in the wrong place. Fix it.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
fs/sysfs/inode.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 082e2d4..38bbe07 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -252,7 +252,7 @@ void sysfs_drop_dentry(struct sysfs_dirent * sd, struct dentry * parent)
if (dentry) {
spin_lock(&dcache_lock);
spin_lock(&dentry->d_lock);
- if (!(d_unhashed(dentry) && dentry->d_inode)) {
+ if (!d_unhashed(dentry) && dentry->d_inode) {
inode = dentry->d_inode;
spin_lock(&inode->i_lock);
__iget(inode);
--
1.5.2.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH 3/3] sysfs: fix race condition around sd->s_dentry, take#2
2007-06-12 23:22 ` [PATCH 2/3] sysfs: fix condition check in sysfs_drop_dentry() Greg Kroah-Hartman
@ 2007-06-12 23:22 ` Greg Kroah-Hartman
0 siblings, 0 replies; 5+ messages in thread
From: Greg Kroah-Hartman @ 2007-06-12 23:22 UTC (permalink / raw)
To: linux-kernel; +Cc: Tejun Heo, Greg Kroah-Hartman
From: Tejun Heo <htejun@gmail.com>
Allowing attribute and symlink dentries to be reclaimed means
sd->s_dentry can change dynamically. However, updates to the field
are unsynchronized leading to race conditions. This patch adds
sysfs_lock and use it to synchronize updates to sd->s_dentry.
Due to the locking around ->d_iput, the check in sysfs_drop_dentry()
is complex. sysfs_lock only protect sd->s_dentry pointer itself. The
validity of the dentry is protected by dcache_lock, so whether dentry
is alive or not can only be tested while holding both locks.
This is minimal backport of sysfs_drop_dentry() rewrite in devel
branch.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
fs/sysfs/dir.c | 22 ++++++++++++++++++++--
fs/sysfs/inode.c | 18 +++++++++++++++++-
fs/sysfs/sysfs.h | 1 +
3 files changed, 38 insertions(+), 3 deletions(-)
diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 17a8191..c4342a0 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -13,14 +13,26 @@
#include "sysfs.h"
DECLARE_RWSEM(sysfs_rename_sem);
+spinlock_t sysfs_lock = SPIN_LOCK_UNLOCKED;
static void sysfs_d_iput(struct dentry * dentry, struct inode * inode)
{
struct sysfs_dirent * sd = dentry->d_fsdata;
if (sd) {
- BUG_ON(sd->s_dentry != dentry);
- sd->s_dentry = NULL;
+ /* sd->s_dentry is protected with sysfs_lock. This
+ * allows sysfs_drop_dentry() to dereference it.
+ */
+ spin_lock(&sysfs_lock);
+
+ /* The dentry might have been deleted or another
+ * lookup could have happened updating sd->s_dentry to
+ * point the new dentry. Ignore if it isn't pointing
+ * to this dentry.
+ */
+ if (sd->s_dentry == dentry)
+ sd->s_dentry = NULL;
+ spin_unlock(&sysfs_lock);
sysfs_put(sd);
}
iput(inode);
@@ -247,7 +259,10 @@ static int sysfs_attach_attr(struct sysfs_dirent * sd, struct dentry * dentry)
}
dentry->d_fsdata = sysfs_get(sd);
+ /* protect sd->s_dentry against sysfs_d_iput */
+ spin_lock(&sysfs_lock);
sd->s_dentry = dentry;
+ spin_unlock(&sysfs_lock);
error = sysfs_create(dentry, (attr->mode & S_IALLUGO) | S_IFREG, init);
if (error) {
sysfs_put(sd);
@@ -269,7 +284,10 @@ static int sysfs_attach_link(struct sysfs_dirent * sd, struct dentry * dentry)
int err = 0;
dentry->d_fsdata = sysfs_get(sd);
+ /* protect sd->s_dentry against sysfs_d_iput */
+ spin_lock(&sysfs_lock);
sd->s_dentry = dentry;
+ spin_unlock(&sysfs_lock);
err = sysfs_create(dentry, S_IFLNK|S_IRWXUGO, init_symlink);
if (!err) {
dentry->d_op = &sysfs_dentry_ops;
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 38bbe07..5266eec 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -246,9 +246,23 @@ static inline void orphan_all_buffers(struct inode *node)
*/
void sysfs_drop_dentry(struct sysfs_dirent * sd, struct dentry * parent)
{
- struct dentry * dentry = sd->s_dentry;
+ struct dentry *dentry = NULL;
struct inode *inode;
+ /* We're not holding a reference to ->s_dentry dentry but the
+ * field will stay valid as long as sysfs_lock is held.
+ */
+ spin_lock(&sysfs_lock);
+ spin_lock(&dcache_lock);
+
+ /* dget dentry if it's still alive */
+ if (sd->s_dentry && sd->s_dentry->d_inode)
+ dentry = dget_locked(sd->s_dentry);
+
+ spin_unlock(&dcache_lock);
+ spin_unlock(&sysfs_lock);
+
+ /* drop dentry */
if (dentry) {
spin_lock(&dcache_lock);
spin_lock(&dentry->d_lock);
@@ -268,6 +282,8 @@ void sysfs_drop_dentry(struct sysfs_dirent * sd, struct dentry * parent)
spin_unlock(&dentry->d_lock);
spin_unlock(&dcache_lock);
}
+
+ dput(dentry);
}
}
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 1966e1a..502c949 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -33,6 +33,7 @@ extern const unsigned char * sysfs_get_name(struct sysfs_dirent *sd);
extern void sysfs_drop_dentry(struct sysfs_dirent *sd, struct dentry *parent);
extern int sysfs_setattr(struct dentry *dentry, struct iattr *iattr);
+extern spinlock_t sysfs_lock;
extern struct rw_semaphore sysfs_rename_sem;
extern struct super_block * sysfs_sb;
extern const struct file_operations sysfs_dir_operations;
--
1.5.2.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [GIT PATCH] sysfs fixes for 2.6.22-rc4
2007-06-12 23:21 [GIT PATCH] sysfs fixes for 2.6.22-rc4 Greg KH
2007-06-12 23:22 ` [PATCH 1/3] sysfs: store sysfs inode nrs in s_ino to avoid readdir oopses Greg Kroah-Hartman
@ 2007-06-13 3:24 ` Eric Sandeen
1 sibling, 0 replies; 5+ messages in thread
From: Eric Sandeen @ 2007-06-13 3:24 UTC (permalink / raw)
To: Greg KH; +Cc: Linus Torvalds, Andrew Morton, linux-kernel, htejun
Greg KH wrote:
> Eric Sandeen (1):
> sysfs: store sysfs inode nrs in s_ino to avoid readdir oopses
To be fair, Tejun wrote that one too, I just backported & tried to
simplify it a bit for -stable. :)
> Tejun Heo (2):
> sysfs: fix condition check in sysfs_drop_dentry()
> sysfs: fix race condition around sd->s_dentry, take#2
>
Glad to see these are making their way out, we've had people hit these
bugs too. And they beat on this basic patchset quite a lot, with good
results.
Thanks!
-Eric
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2007-06-13 3:24 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-06-12 23:21 [GIT PATCH] sysfs fixes for 2.6.22-rc4 Greg KH
2007-06-12 23:22 ` [PATCH 1/3] sysfs: store sysfs inode nrs in s_ino to avoid readdir oopses Greg Kroah-Hartman
2007-06-12 23:22 ` [PATCH 2/3] sysfs: fix condition check in sysfs_drop_dentry() Greg Kroah-Hartman
2007-06-12 23:22 ` [PATCH 3/3] sysfs: fix race condition around sd->s_dentry, take#2 Greg Kroah-Hartman
2007-06-13 3:24 ` [GIT PATCH] sysfs fixes for 2.6.22-rc4 Eric Sandeen
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.