linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* RE: RE: oops in 2.4.25 prune_icache() called from kswapd
@ 2005-06-19 22:51 Velupula, Prakash
  2005-06-19 23:07 ` Chris Caputo
  0 siblings, 1 reply; 6+ messages in thread
From: Velupula, Prakash @ 2005-06-19 22:51 UTC (permalink / raw)
  To: Marcelo Tosatti, Chris Caputo
  Cc: Albert Chu, David Woodhouse, Al Viro, linux-fsdevel, lwoodman

Hi Marcelo,

We are using 2.4.20 and would be able test it. But before that, we are
trying to recreate the problem. We just had one occurrence of this
problem. Do you have any inputs on reproducing this?

Thanks,
Prakash 

-----Original Message-----
From: Marcelo Tosatti [mailto:marcelo.tosatti@cyclades.com] 
Sent: Saturday, June 18, 2005 3:34 PM
To: Chris Caputo
Cc: Albert Chu; David Woodhouse; Al Viro; Velupula, Prakash;
linux-fsdevel@vger.kernel.org; lwoodman@redhat.com
Subject: Re: RE: oops in 2.4.25 prune_icache() called from kswapd

On Sun, Jun 19, 2005 at 01:11:20AM +0000, Chris Caputo wrote:
> Hi,
> 
> Marcello, your patch didn't come through.  Unfortunately I can't test 
> it anymore since my environment has now changed, but hopefully Albert 
> or Prakash can try it when you resend.

Albert, Prakash, any of you using stock v2.4? 

Here's the diff:

--- a/fs/inode.c.orig	2005-06-18 11:19:21.508857600 -0300
+++ b/fs/inode.c	2005-06-18 11:21:03.925287936 -0300
@@ -1238,15 +1238,18 @@
 				BUG();
 		} else {
 			if (!list_empty(&inode->i_hash)) {
-				if (!(inode->i_state &
(I_DIRTY|I_LOCK))) 
-					__refile_inode(inode);
-				inodes_stat.nr_unused++;
-				spin_unlock(&inode_lock);
-				if (!sb || (sb->s_flags & MS_ACTIVE))
+				if (!sb || (sb->s_flags & MS_ACTIVE)) {
+					if (!(inode->i_state & 
+					(I_DIRTY|I_LOCK))) {
+						__refile_inode(inode);
+						inodes_stat.nr_unused++;
+					}
+					spin_unlock(&inode_lock);
 					return;
+				}
+				spin_unlock(&inode_lock);
 				write_inode_now(inode, 1);
 				spin_lock(&inode_lock);
-				inodes_stat.nr_unused--;
 				list_del_init(&inode->i_hash);
 			}
 			list_del_init(&inode->i_list);
> 
> Thanks,
> Chris
> 
> On Sat, 18 Jun 2005, Marcelo Tosatti wrote:
> >Hi,
> >
> >Shame the RH bugzilla (#155289) requires super priviledges to be 
> >accessed.
> >
> >I've got around reading Albert's description of the race - thanks
BTW.
> >(its attached below for reference)
> >
> >It seems to me that only window open in mainline is between iput() 
> >and prune_icache(), while iput() sleeps on sync_one() with the inode
> >being:
> >
> >- on the unused list
> >- and with i_count set to zero
> >
> >prune_icache() is free to invalidate and destroy the inode in the 
> >meantime, causing iput()'s sync_one() to __refile_inode() the NULL 
> >entry to the unused list later on.
> >
> >If that is indeed the case, removing __refile_inode() from the 
> >nonzero
> >inode->i_nlink path should close that window.
> >
> >Chris, can you please test the attached "iput-iprune-race.patch" with

> >your usual irqbalance enable environment ?
> >
> >
> >On Sat, Jun 18, 2005 at 01:02:20PM -0700, Albert Chu wrote:
> >>Howdy everyone,
> >>
> >>>Albert Chu, CC'ed, has suggested the below as a fix.  Albert, any 
> >>>new info on this, or have these two patches cleared up the problem
well?
> >>
> >>The __refile_inode() patch below fixed the problem for us on our 
> >>clusters (running RHEL3).  The clear_inode() patch is something 
> >>Redhat (I think Larry Woodman) gave to us to fix a problem he 
> >>believes is in the same general area.  We haven't had any additional

> >>problems adding his second patch.
> >>
> >>>BTW, the below mail says that these are workarounds and a real fix 
> >>>is on the way. Has that been rolled in as well?
> >>
> >>Sorry, not too sure.  :-(
> >
> >Albert's description:
> >
> >>Just thought I'd let you know what's up.  We think we're close to 
> >>getting to the bottom of this.  We think the race is between iput() 
> >>and __refile_inode().  The Redhat kernel is of course different than

> >>the mainline kernel, but I think the same bug exists in the 
> >>mainline.  The example below illustrates the race between iput() and

> >>__sync_one(), but it could occur with other areas that all 
> >>__refile_inode().  For us, I think we're hitting it in __sync_one() 
> >>and prune_icache().  (The
> >>prune_icache() call to __refile_inode() doesn't seem to be in the 
> >>mainline though).
> >>
> >>proc 0:
> >>calls iput() and locks inode_lock.
> >>iput removes the inode off of the i_list unlocks inode_lock (proc 1 
> >>will now at some point grab inode_lock) calls clear_inode() (this is

> >>key) gets past the call to wait_on_inode(); at this point 
> >>clear_inode() and the remainder of iput() does not care about I_LOCK

> >>or inode_lock.
> >>
> >>proc 1:
> >>calls __sync_one() with inode_lock.
> >>sets I_LOCK
> >>do stuff before __refile_inode() is called, all I_LOCK/inode_lock 
> >>stuff doesn't matter.
> >>
> >>proc 0:
> >>sets i_state = I_CLEAR
> >>iput calls destroy_inode()
> >>
> >>proc 1:
> >>calls __refile_inode
> >>
> >>and we have ourselves a corrupted inode on the inode_unused list.
> >>
> >>I'm not sure if you can see it or not, but Redhat bugzilla 155289 is

> >>tracking this.
> >>
> >>Al

^ permalink raw reply	[flat|nested] 6+ messages in thread
[parent not found: <390de35bed.35bed390de@llnl.gov>]

end of thread, other threads:[~2005-07-26 23:35 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-06-19 22:51 RE: oops in 2.4.25 prune_icache() called from kswapd Velupula, Prakash
2005-06-19 23:07 ` Chris Caputo
2005-07-26 11:01   ` Marcelo Tosatti
     [not found] <390de35bed.35bed390de@llnl.gov>
2005-06-18 19:45 ` Marcelo Tosatti
2005-06-19  1:11   ` Chris Caputo
2005-06-18 20:33     ` Marcelo Tosatti

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).