* [Ocfs2-devel] [Ocfs2-users] ocfs or configfs bug ? [not found] ` <20110419194840.GA27826@noexit> @ 2011-04-20 0:20 ` Sunil Mushran 2011-05-11 1:38 ` Sunil Mushran 0 siblings, 1 reply; 7+ messages in thread From: Sunil Mushran @ 2011-04-20 0:20 UTC (permalink / raw) To: ocfs2-devel On 04/19/2011 12:48 PM, Joel Becker wrote: > You're too late here. This is in the echo process (bash, > really). getdents() isn't happening. > The problem is almost certainly in configfs. It's a race > between setup and teardown of the virtual attribute files. If anyone > else has a cycle to look at it, great, otherwise I'll try to get to it > later this week. So we ran into it internally. This is what I wrote in the bug. /@ The matching code in configfs_readir() is:/ /@ name = configfs_get_name(next);/ /@ len = strlen(name);/ /@ if (next->s_dentry)/ /@ ino = next->s_dentry->d_inode->i_ino; <===/ /@ else/ /@ ino = iunique(configfs_sb, 2);/ /@ ./ /@ if (filldir(dirent, name, len, filp->f_pos, ino,/ /@ dt_type(next)) < 0)/ /@ return 0;/ /@ ./ /@ The oops indicates that next->s_dentry->d_inode is NULL./ Joel, does this give you any clues? BTW, thanks for the testcase. And yes, I can reproduce it easily. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://oss.oracle.com/pipermail/ocfs2-devel/attachments/20110419/6b6d3b34/attachment.html ^ permalink raw reply [flat|nested] 7+ messages in thread
* [Ocfs2-devel] [Ocfs2-users] ocfs or configfs bug ? 2011-04-20 0:20 ` [Ocfs2-devel] [Ocfs2-users] ocfs or configfs bug ? Sunil Mushran @ 2011-05-11 1:38 ` Sunil Mushran 2011-05-18 10:49 ` Joel Becker 0 siblings, 1 reply; 7+ messages in thread From: Sunil Mushran @ 2011-05-11 1:38 UTC (permalink / raw) To: ocfs2-devel On 04/19/2011 05:20 PM, Sunil Mushran wrote: > On 04/19/2011 12:48 PM, Joel Becker wrote: >> You're too late here. This is in the echo process (bash, >> really). getdents() isn't happening. >> The problem is almost certainly in configfs. It's a race >> between setup and teardown of the virtual attribute files. If anyone >> else has a cycle to look at it, great, otherwise I'll try to get to it >> later this week. > > So we ran into it internally. This is what I wrote in the bug. > > /@ The matching code in configfs_readir() is:/ > /@ name = configfs_get_name(next);/ > /@ len = strlen(name);/ > /@ if (next->s_dentry)/ > /@ ino = next->s_dentry->d_inode->i_ino; <===/ > /@ else/ > /@ ino = iunique(configfs_sb, 2);/ > /@ ./ > /@ if (filldir(dirent, name, len, filp->f_pos, ino,/ > /@ dt_type(next)) < 0)/ > /@ return 0;/ > /@ ./ > /@ The oops indicates that next->s_dentry->d_inode is NULL./ > > Joel, does this give you any clues? > > BTW, thanks for the testcase. And yes, I can reproduce it easily. configfs_readdir() is racing configfs_d_delete(). And I cannot see how we can use the existing spinlock to prevent the race. sysfs uses a coarse lock sysfs_mutex to prevent against this. I think we should do the same in configfs. Comments? ============================================= commit 3007e997de91ec59af39a3f9c91595b31ae6e08b Author: Tejun Heo <htejun@gmail.com> Date: Thu Jun 14 04:27:23 2007 +0900 sysfs: use sysfs_mutex to protect the sysfs_dirent tree As kobj sysfs dentries and inodes are gonna be made reclaimable, i_mutex can't be used to protect sysfs_dirent tree. Use sysfs_mutex globally instead. As the whole tree is protected with sysfs_mutex, there is no reason to keep sysfs_rename_sem. Drop it. While at it, add docbook comments to functions which require sysfs_mutex locking. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> ============================================= -------------- next part -------------- An HTML attachment was scrubbed... URL: http://oss.oracle.com/pipermail/ocfs2-devel/attachments/20110510/4219fe9b/attachment.html ^ permalink raw reply [flat|nested] 7+ messages in thread
* [Ocfs2-devel] [Ocfs2-users] ocfs or configfs bug ? 2011-05-11 1:38 ` Sunil Mushran @ 2011-05-18 10:49 ` Joel Becker 2011-05-18 11:12 ` Joel Becker 0 siblings, 1 reply; 7+ messages in thread From: Joel Becker @ 2011-05-18 10:49 UTC (permalink / raw) To: ocfs2-devel On Tue, May 10, 2011 at 06:38:17PM -0700, Sunil Mushran wrote: > configfs_readdir() is racing configfs_d_delete(). And I cannot > see how we can use the existing spinlock to prevent the race. > > sysfs uses a coarse lock sysfs_mutex to prevent against this. > I think we should do the same in configfs. sysfs_mutex is peppered everywhere. It's rather ugly, I think. I am thinking about how the code reads. I admit I also don't like that sysfs_mutex serializes all accesses to the sysfs directory tree, but I'd be OK with that as a stopgap solution if the code only read better. Joel -- "The nice thing about egotists is that they don't talk about other people." - Lucille S. Harper http://www.jlbec.org/ jlbec at evilplan.org ^ permalink raw reply [flat|nested] 7+ messages in thread
* [Ocfs2-devel] [Ocfs2-users] ocfs or configfs bug ? 2011-05-18 10:49 ` Joel Becker @ 2011-05-18 11:12 ` Joel Becker 2011-05-18 18:36 ` Sunil Mushran 0 siblings, 1 reply; 7+ messages in thread From: Joel Becker @ 2011-05-18 11:12 UTC (permalink / raw) To: ocfs2-devel On Wed, May 18, 2011 at 03:49:56AM -0700, Joel Becker wrote: > sysfs_mutex is peppered everywhere. It's rather ugly, I think. > I am thinking about how the code reads. I admit I also don't like that > sysfs_mutex serializes all accesses to the sysfs directory tree, but I'd > be OK with that as a stopgap solution if the code only read better. Can you test this? Joel From 24307aa1e707b31613be92deaba7990e16bc1aec Mon Sep 17 00:00:00 2001 From: Joel Becker <jlbec@evilplan.org> Date: Wed, 18 May 2011 04:08:16 -0700 Subject: [PATCH] configfs: Fix race between configfs_readdir() and configfs_d_iput() configfs_readdir() will use the existing inode numbers of inodes in the dcache, but it makes them up for attribute files that aren't currently instantiated. There is a race where a closing attribute file can be tearing down at the same time as configfs_readdir() is trying to get its inode number. We want to get the inode number of open attribute files, because they should match while instantiated. We can't lock down the transition where dentry->d_inode is set to NULL, so we just check for NULL there. We can, however, ensure that an inode we find isn't iput() in configfs_d_iput() until after we've accessed it. Signed-off-by: Joel Becker <jlbec@evilplan.org> --- fs/configfs/dir.c | 33 ++++++++++++++++++++++++++++----- 1 files changed, 28 insertions(+), 5 deletions(-) diff --git a/fs/configfs/dir.c b/fs/configfs/dir.c index b11d734..9a37a9b 100644 --- a/fs/configfs/dir.c +++ b/fs/configfs/dir.c @@ -53,11 +53,14 @@ DEFINE_SPINLOCK(configfs_dirent_lock); static void configfs_d_iput(struct dentry * dentry, struct inode * inode) { - struct configfs_dirent * sd = dentry->d_fsdata; + struct configfs_dirent *sd = dentry->d_fsdata; if (sd) { BUG_ON(sd->s_dentry != dentry); + /* Coordinate with configfs_readdir */ + spin_lock(&configfs_dirent_lock); sd->s_dentry = NULL; + spin_unlock(&configfs_dirent_lock); configfs_put(sd); } iput(inode); @@ -1546,7 +1549,7 @@ static int configfs_readdir(struct file * filp, void * dirent, filldir_t filldir struct configfs_dirent * parent_sd = dentry->d_fsdata; struct configfs_dirent *cursor = filp->private_data; struct list_head *p, *q = &cursor->s_sibling; - ino_t ino; + ino_t ino = 0; int i = filp->f_pos; switch (i) { @@ -1574,6 +1577,7 @@ static int configfs_readdir(struct file * filp, void * dirent, filldir_t filldir struct configfs_dirent *next; const char * name; int len; + struct inode *inode = NULL; next = list_entry(p, struct configfs_dirent, s_sibling); @@ -1582,9 +1586,28 @@ static int configfs_readdir(struct file * filp, void * dirent, filldir_t filldir name = configfs_get_name(next); len = strlen(name); - if (next->s_dentry) - ino = next->s_dentry->d_inode->i_ino; - else + + /* + * We'll have a dentry and an inode for + * PINNED items and for open attribute + * files. We lock here to prevent a race + * with configfs_d_iput() clearing + * s_dentry before calling iput(). + * + * Why do we go to the trouble? If + * someone has an attribute file open, + * the inode number should match until + * they close it. Beyond that, we don't + * care. + */ + spin_lock(&configfs_dirent_lock); + dentry = next->s_dentry; + if (dentry) + inode = dentry->d_inode; + if (inode) + ino = inode->i_ino; + spin_unlock(&configfs_dirent_lock); + if (!inode) ino = iunique(configfs_sb, 2); if (filldir(dirent, name, len, filp->f_pos, ino, -- 1.7.3.1 -- "But all my words come back to me In shades of mediocrity. Like emptiness in harmony I need someone to comfort me." http://www.jlbec.org/ jlbec at evilplan.org ^ permalink raw reply related [flat|nested] 7+ messages in thread
* [Ocfs2-devel] [Ocfs2-users] ocfs or configfs bug ? 2011-05-18 11:12 ` Joel Becker @ 2011-05-18 18:36 ` Sunil Mushran 2011-05-18 19:11 ` Joel Becker 0 siblings, 1 reply; 7+ messages in thread From: Sunil Mushran @ 2011-05-18 18:36 UTC (permalink / raw) To: ocfs2-devel On 05/18/2011 04:12 AM, Joel Becker wrote: > On Wed, May 18, 2011 at 03:49:56AM -0700, Joel Becker wrote: >> sysfs_mutex is peppered everywhere. It's rather ugly, I think. >> I am thinking about how the code reads. I admit I also don't like that >> sysfs_mutex serializes all accesses to the sysfs directory tree, but I'd >> be OK with that as a stopgap solution if the code only read better. > Can you test this? > > Joel So it did run for far longer. But then hit the following. http://oss.oracle.com/~smushran/configfs_oops_20110518.txt BTW, this is not new. I was running into it earlier too. Albeit less frequently than readdir. > From 24307aa1e707b31613be92deaba7990e16bc1aec Mon Sep 17 00:00:00 2001 > From: Joel Becker<jlbec@evilplan.org> > Date: Wed, 18 May 2011 04:08:16 -0700 > Subject: [PATCH] configfs: Fix race between configfs_readdir() and configfs_d_iput() > > configfs_readdir() will use the existing inode numbers of inodes in the > dcache, but it makes them up for attribute files that aren't currently > instantiated. There is a race where a closing attribute file can be > tearing down at the same time as configfs_readdir() is trying to get its > inode number. > > We want to get the inode number of open attribute files, because they > should match while instantiated. We can't lock down the transition > where dentry->d_inode is set to NULL, so we just check for NULL there. > We can, however, ensure that an inode we find isn't iput() in > configfs_d_iput() until after we've accessed it. > > Signed-off-by: Joel Becker<jlbec@evilplan.org> > --- > fs/configfs/dir.c | 33 ++++++++++++++++++++++++++++----- > 1 files changed, 28 insertions(+), 5 deletions(-) > > diff --git a/fs/configfs/dir.c b/fs/configfs/dir.c > index b11d734..9a37a9b 100644 > --- a/fs/configfs/dir.c > +++ b/fs/configfs/dir.c > @@ -53,11 +53,14 @@ DEFINE_SPINLOCK(configfs_dirent_lock); > static void configfs_d_iput(struct dentry * dentry, > struct inode * inode) > { > - struct configfs_dirent * sd = dentry->d_fsdata; > + struct configfs_dirent *sd = dentry->d_fsdata; > > if (sd) { > BUG_ON(sd->s_dentry != dentry); > + /* Coordinate with configfs_readdir */ > + spin_lock(&configfs_dirent_lock); > sd->s_dentry = NULL; > + spin_unlock(&configfs_dirent_lock); > configfs_put(sd); > } > iput(inode); > @@ -1546,7 +1549,7 @@ static int configfs_readdir(struct file * filp, void * dirent, filldir_t filldir > struct configfs_dirent * parent_sd = dentry->d_fsdata; > struct configfs_dirent *cursor = filp->private_data; > struct list_head *p, *q =&cursor->s_sibling; > - ino_t ino; > + ino_t ino = 0; > int i = filp->f_pos; > > switch (i) { > @@ -1574,6 +1577,7 @@ static int configfs_readdir(struct file * filp, void * dirent, filldir_t filldir > struct configfs_dirent *next; > const char * name; > int len; > + struct inode *inode = NULL; > > next = list_entry(p, struct configfs_dirent, > s_sibling); > @@ -1582,9 +1586,28 @@ static int configfs_readdir(struct file * filp, void * dirent, filldir_t filldir > > name = configfs_get_name(next); > len = strlen(name); > - if (next->s_dentry) > - ino = next->s_dentry->d_inode->i_ino; > - else > + > + /* > + * We'll have a dentry and an inode for > + * PINNED items and for open attribute > + * files. We lock here to prevent a race > + * with configfs_d_iput() clearing > + * s_dentry before calling iput(). > + * > + * Why do we go to the trouble? If > + * someone has an attribute file open, > + * the inode number should match until > + * they close it. Beyond that, we don't > + * care. > + */ > + spin_lock(&configfs_dirent_lock); > + dentry = next->s_dentry; > + if (dentry) > + inode = dentry->d_inode; > + if (inode) > + ino = inode->i_ino; > + spin_unlock(&configfs_dirent_lock); > + if (!inode) > ino = iunique(configfs_sb, 2); > > if (filldir(dirent, name, len, filp->f_pos, ino, ^ permalink raw reply [flat|nested] 7+ messages in thread
* [Ocfs2-devel] [Ocfs2-users] ocfs or configfs bug ? 2011-05-18 18:36 ` Sunil Mushran @ 2011-05-18 19:11 ` Joel Becker 2011-05-18 20:08 ` Sunil Mushran 0 siblings, 1 reply; 7+ messages in thread From: Joel Becker @ 2011-05-18 19:11 UTC (permalink / raw) To: ocfs2-devel Oh, over here. On Wed, May 18, 2011 at 11:36:59AM -0700, Sunil Mushran wrote: > On 05/18/2011 04:12 AM, Joel Becker wrote: > > On Wed, May 18, 2011 at 03:49:56AM -0700, Joel Becker wrote: > >> sysfs_mutex is peppered everywhere. It's rather ugly, I think. > >> I am thinking about how the code reads. I admit I also don't like that > >> sysfs_mutex serializes all accesses to the sysfs directory tree, but I'd > >> be OK with that as a stopgap solution if the code only read better. > > Can you test this? > > > > Joel > > > So it did run for far longer. But then hit the following. > > http://oss.oracle.com/~smushran/configfs_oops_20110518.txt I got the same result. > BTW, this is not new. I was running into it earlier too. Albeit > less frequently than readdir. Do you think this is related, or does my patch fix readdir and go upstream with another fix coming for this BUG_ON()? Joel -- "The trouble with being punctual is that nobody's there to appreciate it." - Franklin P. Jones http://www.jlbec.org/ jlbec at evilplan.org ^ permalink raw reply [flat|nested] 7+ messages in thread
* [Ocfs2-devel] [Ocfs2-users] ocfs or configfs bug ? 2011-05-18 19:11 ` Joel Becker @ 2011-05-18 20:08 ` Sunil Mushran 0 siblings, 0 replies; 7+ messages in thread From: Sunil Mushran @ 2011-05-18 20:08 UTC (permalink / raw) To: ocfs2-devel On 05/18/2011 12:11 PM, Joel Becker wrote: > Oh, over here. > > On Wed, May 18, 2011 at 11:36:59AM -0700, Sunil Mushran wrote: >> On 05/18/2011 04:12 AM, Joel Becker wrote: >>> On Wed, May 18, 2011 at 03:49:56AM -0700, Joel Becker wrote: >>>> sysfs_mutex is peppered everywhere. It's rather ugly, I think. >>>> I am thinking about how the code reads. I admit I also don't like that >>>> sysfs_mutex serializes all accesses to the sysfs directory tree, but I'd >>>> be OK with that as a stopgap solution if the code only read better. >>> Can you test this? >>> >>> Joel >> >> So it did run for far longer. But then hit the following. >> >> http://oss.oracle.com/~smushran/configfs_oops_20110518.txt > I got the same result. > >> BTW, this is not new. I was running into it earlier too. Albeit >> less frequently than readdir. > Do you think this is related, or does my patch fix readdir and > go upstream with another fix coming for this BUG_ON()? It does not look directly related. Though it too points to a hole in the locking. A second patch for this BUG_ON will work. ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2011-05-18 20:08 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <4DADA228.5050209@bull.net>
[not found] ` <20110419194840.GA27826@noexit>
2011-04-20 0:20 ` [Ocfs2-devel] [Ocfs2-users] ocfs or configfs bug ? Sunil Mushran
2011-05-11 1:38 ` Sunil Mushran
2011-05-18 10:49 ` Joel Becker
2011-05-18 11:12 ` Joel Becker
2011-05-18 18:36 ` Sunil Mushran
2011-05-18 19:11 ` Joel Becker
2011-05-18 20:08 ` Sunil Mushran
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.