* Re: RE: oops in 2.4.25 prune_icache() called from kswapd
[not found] <390de35bed.35bed390de@llnl.gov>
@ 2005-06-18 19:45 ` Marcelo Tosatti
2005-06-19 1:11 ` Chris Caputo
0 siblings, 1 reply; 6+ messages in thread
From: Marcelo Tosatti @ 2005-06-18 19:45 UTC (permalink / raw)
To: Albert Chu, David Woodhouse, Al Viro
Cc: Velupula, Prakash, Chris Caputo, linux-fsdevel, lwoodman
Hi,
Shame the RH bugzilla (#155289) requires super privileges to be
accessed.
I've gotten around to reading Albert's description of the race - thanks BTW.
(it's attached below for reference)
It seems to me that the only window open in mainline is between iput() and
prune_icache(), while iput() sleeps on sync_one() with the inode
being:
- on the unused list
- and with i_count set to zero
prune_icache() is free to invalidate and destroy the inode in the meantime,
causing iput()'s sync_one() to __refile_inode() the NULL entry to the
unused list later on.
If that is indeed the case, removing __refile_inode() from the nonzero
inode->i_nlink path should close that window.
Chris, can you please test the attached "iput-iprune-race.patch" with
your usual irqbalance-enabled environment?
On Sat, Jun 18, 2005 at 01:02:20PM -0700, Albert Chu wrote:
> Howdy everyone,
>
> > Albert Chu, CC'ed, has suggested the below as a fix. Albert, any new
> > info on this, or have these two patches cleared up the problem well?
>
> The __refile_inode() patch below fixed the problem for us on our
> clusters (running RHEL3). The clear_inode() patch is something Redhat
> (I think Larry Woodman) gave to us to fix a problem he believes is in
> the same general area. We haven't had any additional problems adding
> his second patch.
>
> > BTW, the below mail says that these are workarounds and a real fix
> > is on the way. Has that been rolled in as well?
>
> Sorry, not too sure. :-(
Albert's description:
> Just thought I'd let you know what's up. We think we're close to
> getting to the bottom of this. We think the race is between iput() and
> __refile_inode(). The Redhat kernel is of course different than the
> mainline kernel, but I think the same bug exists in the mainline. The
> example below illustrates the race between iput() and __sync_one(), but
> it could occur with other areas that call __refile_inode(). For us, I
> think we're hitting it in __sync_one() and prune_icache(). (The
> prune_icache() call to __refile_inode() doesn't seem to be in the
> mainline though).
>
> proc 0:
> calls iput() and locks inode_lock.
> iput removes the inode off of the i_list
> unlocks inode_lock (proc 1 will now at some point grab inode_lock)
> calls clear_inode()
> (this is key) gets past the call to wait_on_inode();
> at this point clear_inode() and the remainder of iput() does not care
> about I_LOCK or inode_lock.
>
> proc 1:
> calls __sync_one() with inode_lock.
> sets I_LOCK
> do stuff before __refile_inode() is called, all I_LOCK/inode_lock stuff
> doesn't matter.
>
> proc 0:
> sets i_state = I_CLEAR
> iput calls destroy_inode()
>
> proc 1:
> calls __refile_inode
>
> and we have ourselves a corrupted inode on the inode_unused list.
>
> I'm not sure if you can see it or not, but Redhat bugzilla 155289 is
> tracking this.
>
> Al
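The interleaving Albert describes can be replayed deterministically in a small userspace model. This is only a sketch under stated assumptions: the struct, the flag values, and the function names below are hypothetical stand-ins, not the 2.4 kernel definitions, and the model assumes a refile step that tests only a "freeing" bit and ignores the hash-list checks and all real locking.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the i_state bits; values are illustrative only. */
enum { RM_I_LOCK = 1 << 0, RM_I_FREEING = 1 << 1, RM_I_CLEAR = 1 << 2 };

/* Minimal model of the fields that matter for the race. */
struct rm_inode {
    unsigned state;   /* models inode->i_state */
    bool on_unused;   /* models membership of the inode_unused list */
    bool destroyed;   /* models destroy_inode() having run */
};

/* Assumed refile step: skips only inodes marked as freeing. */
static void rm_refile(struct rm_inode *in)
{
    if (in->state & RM_I_FREEING)
        return;
    in->on_unused = true;
}

/*
 * Replays the four steps from the description in order and reports
 * whether a destroyed inode ends up on the modelled unused list.
 */
bool rm_replay_interleaving(void)
{
    struct rm_inode in = { RM_I_FREEING, true, false };

    /* proc 0: iput() unlinks the inode from i_list, drops inode_lock,
     * enters clear_inode() and gets past wait_on_inode(). */
    in.on_unused = false;

    /* proc 1: __sync_one() runs with inode_lock held and sets I_LOCK. */
    in.state |= RM_I_LOCK;

    /* proc 0: plain store i_state = I_CLEAR (overwrites every other bit),
     * then destroy_inode(). */
    in.state = RM_I_CLEAR;
    in.destroyed = true;

    /* proc 1: reaches the refile step; the freeing bit is gone. */
    rm_refile(&in);

    return in.destroyed && in.on_unused;
}
```

In this model rm_replay_interleaving() returns true: the plain store of the "clear" state wipes the "freeing" bit, so the refile step no longer recognises the inode as dying and links it back onto the unused list.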
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: RE: oops in 2.4.25 prune_icache() called from kswapd
2005-06-18 19:45 ` RE: oops in 2.4.25 prune_icache() called from kswapd Marcelo Tosatti
@ 2005-06-19 1:11 ` Chris Caputo
2005-06-18 20:33 ` Marcelo Tosatti
0 siblings, 1 reply; 6+ messages in thread
From: Chris Caputo @ 2005-06-19 1:11 UTC (permalink / raw)
To: Marcelo Tosatti
Cc: Albert Chu, David Woodhouse, Al Viro, Prakash Velupula,
linux-fsdevel, lwoodman
Hi,
Marcelo, your patch didn't come through. Unfortunately I can't test it
anymore since my environment has now changed, but hopefully Albert or
Prakash can try it when you resend.
Thanks,
Chris
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: RE: oops in 2.4.25 prune_icache() called from kswapd
2005-06-19 1:11 ` Chris Caputo
@ 2005-06-18 20:33 ` Marcelo Tosatti
0 siblings, 0 replies; 6+ messages in thread
From: Marcelo Tosatti @ 2005-06-18 20:33 UTC (permalink / raw)
To: Chris Caputo
Cc: Albert Chu, David Woodhouse, Al Viro, Prakash Velupula,
linux-fsdevel, lwoodman
On Sun, Jun 19, 2005 at 01:11:20AM +0000, Chris Caputo wrote:
> Hi,
>
> Marcelo, your patch didn't come through. Unfortunately I can't test it
> anymore since my environment has now changed, but hopefully Albert or
> Prakash can try it when you resend.
Albert, Prakash, are either of you using stock v2.4?
Here's the diff:
--- a/fs/inode.c.orig 2005-06-18 11:19:21.508857600 -0300
+++ b/fs/inode.c 2005-06-18 11:21:03.925287936 -0300
@@ -1238,15 +1238,18 @@
 				BUG();
 		} else {
 			if (!list_empty(&inode->i_hash)) {
-				if (!(inode->i_state & (I_DIRTY|I_LOCK)))
-					__refile_inode(inode);
-				inodes_stat.nr_unused++;
-				spin_unlock(&inode_lock);
-				if (!sb || (sb->s_flags & MS_ACTIVE))
+				if (!sb || (sb->s_flags & MS_ACTIVE)) {
+					if (!(inode->i_state &
+					     (I_DIRTY|I_LOCK))) {
+						__refile_inode(inode);
+						inodes_stat.nr_unused++;
+					}
+					spin_unlock(&inode_lock);
 					return;
+				}
+				spin_unlock(&inode_lock);
 				write_inode_now(inode, 1);
 				spin_lock(&inode_lock);
-				inodes_stat.nr_unused--;
 				list_del_init(&inode->i_hash);
 			}
 			list_del_init(&inode->i_list);
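To make the effect of the reordering easier to see, here is a rough userspace model of the branch before and after the diff. It is a sketch, not kernel code: the flag values, the result struct, and the function names are hypothetical, the sb_active parameter stands in for the `!sb || (sb->s_flags & MS_ACTIVE)` test, and the "before" version reads the removed lines as brace-less C, so the nr_unused increment is treated as unconditional.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the i_state bits; values are illustrative only. */
enum { BM_I_DIRTY = 1 << 0, BM_I_LOCK = 1 << 1 };

/* What one pass through the branch did, as seen by the caller. */
struct bm_result {
    bool refiled;          /* __refile_inode() would have been called */
    int nr_unused_delta;   /* net change to inodes_stat.nr_unused */
    bool returned_early;   /* took the "return" inside the branch */
};

/* Ordering of the removed ('-') lines. */
struct bm_result bm_branch_before(unsigned state, bool sb_active)
{
    struct bm_result r = { false, 0, false };

    if (!(state & (BM_I_DIRTY | BM_I_LOCK)))
        r.refiled = true;          /* refile happens before any sleep */
    r.nr_unused_delta++;           /* counted even when not refiled */
    if (sb_active) {
        r.returned_early = true;
        return r;
    }
    /* write_inode_now() may sleep here with the inode already refiled */
    r.nr_unused_delta--;
    return r;
}

/* Ordering of the added ('+') lines. */
struct bm_result bm_branch_after(unsigned state, bool sb_active)
{
    struct bm_result r = { false, 0, false };

    if (sb_active) {
        if (!(state & (BM_I_DIRTY | BM_I_LOCK))) {
            r.refiled = true;
            r.nr_unused_delta++;   /* counted only when refiled */
        }
        r.returned_early = true;
        return r;
    }
    /* write_inode_now() may sleep here, but nothing was refiled */
    return r;
}
```

Under these assumptions the only behavioural differences are that the inactive-superblock path no longer refiles the inode before write_inode_now() can sleep, and that nr_unused is bumped only when the inode is actually refiled.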
^ permalink raw reply [flat|nested] 6+ messages in thread
* RE: RE: oops in 2.4.25 prune_icache() called from kswapd
@ 2005-06-19 22:51 Velupula, Prakash
2005-06-19 23:07 ` Chris Caputo
0 siblings, 1 reply; 6+ messages in thread
From: Velupula, Prakash @ 2005-06-19 22:51 UTC (permalink / raw)
To: Marcelo Tosatti, Chris Caputo
Cc: Albert Chu, David Woodhouse, Al Viro, linux-fsdevel, lwoodman
Hi Marcelo,
We are using 2.4.20 and would be able to test it. But before that, we are
trying to recreate the problem. We have had just one occurrence of this
problem. Do you have any input on reproducing this?
Thanks,
Prakash
^ permalink raw reply [flat|nested] 6+ messages in thread
* RE: RE: oops in 2.4.25 prune_icache() called from kswapd
2005-06-19 22:51 Velupula, Prakash
@ 2005-06-19 23:07 ` Chris Caputo
2005-07-26 11:01 ` Marcelo Tosatti
0 siblings, 1 reply; 6+ messages in thread
From: Chris Caputo @ 2005-06-19 23:07 UTC (permalink / raw)
To: Velupula, Prakash
Cc: Marcelo Tosatti, Albert Chu, David Woodhouse, Al Viro,
linux-fsdevel, lwoodman
My basic repro method was:
--
0) start irqbalance
1) run loop_dbench, the following script, which runs dbench (using
client_plain.txt) in a loop:
#!/bin/sh
while [ 1 ]
do
    date
    dbench 2
done
2) wait for oops
--
I think I was using dbench-2.1:
http://samba.org/ftp/tridge/dbench/dbench-2.1.tar.gz
In my case irqbalance was key. If I didn't run it I never got the
problem. I think irqbalance just did a good job of exacerbating a race
condition in some way.
Chris
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: RE: oops in 2.4.25 prune_icache() called from kswapd
2005-06-19 23:07 ` Chris Caputo
@ 2005-07-26 11:01 ` Marcelo Tosatti
0 siblings, 0 replies; 6+ messages in thread
From: Marcelo Tosatti @ 2005-07-26 11:01 UTC (permalink / raw)
To: Chris Caputo
Cc: Velupula, Prakash, Albert Chu, David Woodhouse, Al Viro,
linux-fsdevel, lwoodman
FYI
commit cc54d1333e409f714aa9c7db63f7f9ed07cc57a9
tree f301f581dd4389028f8b2588940d456904e552f1
parent 2e8f68c45925123d33d476ce369b570bd989dd9a
author Larry Woodman <lwoodman@redhat.com> Fri, 15 Jul 2005 11:32:08 -0400
committer Marcelo Tosatti <marcelo@dmt.cnet> Tue, 26 Jul 2005 07:52:46 -0300
[PATCH] workaround inode cache (prune_icache/__refile_inode) SMP races
Over the past couple of weeks we have seen two races in the inode cache
code. The first is between [dispose_list()] and __refile_inode() and the
second is between prune_icache() and truncate_inodes(). I posted both of
these patches but wanted to make sure they got properly reviewed and
included in RHEL3-U6.
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -297,7 +297,7 @@ static inline void __refile_inode(struct
 {
 	struct list_head *to;
 
-	if (inode->i_state & I_FREEING)
+	if (inode->i_state & (I_FREEING|I_CLEAR))
 		return;
 	if (list_empty(&inode->i_hash))
 		return;
@@ -634,7 +634,9 @@ void clear_inode(struct inode *inode)
 		cdput(inode->i_cdev);
 		inode->i_cdev = NULL;
 	}
+	spin_lock(&inode_lock);
 	inode->i_state = I_CLEAR;
+	spin_unlock(&inode_lock);
 }
 
 /*
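The first hunk of the patch above widens the early-return test in __refile_inode(). A minimal userspace sketch of just that predicate, before and after, is given here; the flag values and function names are hypothetical stand-ins rather than the kernel's definitions, and the second hunk (taking inode_lock around the state store) is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the i_state bits; values are illustrative only. */
enum { GM_I_FREEING = 1 << 0, GM_I_CLEAR = 1 << 1 };

/* Guard as in the removed line: only a freeing inode is skipped. */
bool gm_refile_allowed_before(unsigned state)
{
    return !(state & GM_I_FREEING);
}

/* Guard as in the added line: freeing or already-cleared inodes are skipped. */
bool gm_refile_allowed_after(unsigned state)
{
    return !(state & (GM_I_FREEING | GM_I_CLEAR));
}
```

Because clear_inode() stores I_CLEAR with a plain assignment, an inode that was marked I_FREEING ends up with only I_CLEAR set. In this model gm_refile_allowed_before(GM_I_CLEAR) is true while gm_refile_allowed_after(GM_I_CLEAR) is false, which is the case the widened test is meant to catch.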
^ permalink raw reply [flat|nested] 6+ messages in thread