* Re: crashes in prune_icache() in 2.4.26 [not found] <Pine.LNX.4.44.0405302141360.12337-100000@nacho.alt.net> @ 2004-05-31 7:00 ` Chris Caputo 2004-05-31 19:38 ` Chris Caputo 0 siblings, 1 reply; 3+ messages in thread From: Chris Caputo @ 2004-05-31 7:00 UTC (permalink / raw) To: Rik van Riel; +Cc: linux-kernel [CC'ed lkml in case anyone else wants to take a shot.] A little more info... I added some printk's to the function to highlight the input value of parameter 'goal' and also to show where the function was returning. My understanding is these printk's all happened in rapid succession: entry > prune_icache(goal = 92370) exit < prune_icache() at if (goal <= 0) Above the function completed without entering the CONFIG_HIGHMEM while loop. entry > prune_icache(goal = 94037) exit < prune_icache() - end of function Above the function went through the CONFIG_HIGHMEM while loop. This was the first time this happened since boot, after a number of prune_icache() calls that had returned prior to the CONFIG_HIGHMEM while loop. entry > prune_icache(goal = 98609) Unable to handle kernel NULL pointer dereference at virtual address 00000004 The final printk above shows the function being entered and then hitting the NULL dereference. Chris On Sun, 30 May 2004, Chris Caputo wrote: > Hi. I have been experiencing a number of crashes in fs/inode.c's > prune_icache() function. I found on linux.bkbits.net that you made the > most recent major change to this function back in January. With that in > mind I hope it is okay to write directly to you. > > I have experienced two kinds of crashes with this function. > > The first is in the older part of the code. Basically the inode_unused > list is somehow getting corrupt and when it does an Oops happens at: > > entry = entry->prev; (line 808 of the 2.4.26 fs/inode.c) > > I haven't yet figured out how it is getting corrupt so any tips welcome. > > A second problem I have seen is that my system has gotten into an infinite > loop in the while loop in the CONFIG_HIGHMEM part of the prune_icache() > code. I haven't yet figured out why. But I am curious about the code at > the beginning of the loop: > > while (goal-- > 0) { > if (list_empty(&inode_unused_pagecache)) > break; > entry = inode_unused_pagecache.prev; > list_del(entry); > list_add(entry, &inode_unused_pagecache); > > Is the intent of the last 3 lines to remove the entry from the end of the > linked-list and then add it to the front, as a way of traversing the list? > Or is it intended that the add be an add to the inode_unused list as > opposed to the inode_unused_pagecache list? > > I'd love to figure out the problems I am experiencing, so any advice on > how to proceed is welcome. The bug happens every few days on our main > fileserver and I have been able to reproduce it on a test fileserver too. > > Chris ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: crashes in prune_icache() in 2.4.26 2004-05-31 7:00 ` crashes in prune_icache() in 2.4.26 Chris Caputo @ 2004-05-31 19:38 ` Chris Caputo 2004-06-21 1:56 ` Marcelo Tosatti 0 siblings, 1 reply; 3+ messages in thread From: Chris Caputo @ 2004-05-31 19:38 UTC (permalink / raw) To: linux-kernel, Rik van Riel Made a little more progress... Turns out the infinite loop wasn't in the CONFIG_HIGMEM while() loop, but rather in the while() loop at the top of the code. Sorry about that. I misunderstood the disassembly until now. So basically both types of crashes I am seeing (NULL deref and infinite loop) are happening in the primary/top while() loop of prune_icache() and I suspect they are both the result of a corrupted inode_unused list. Now I am trying to figure out where/how the inode_unused list is getting corrupted... If anyone has any existing code for validating list integrity, which I could sprinkle around the code, I'd love a copy. Thanks, Chris On Mon, 31 May 2004, Chris Caputo wrote: > [CC'ed lkml in case anyone else wants to take a shot.] > > A little more info... I added some printk's to the function to highlight > the input value of parameter 'goal' and also to show where the function > was returning. > > My understanding is these printk's all happened in rapid succession: > > entry > prune_icache(goal = 92370) > exit < prune_icache() at if (goal <= 0) > > Above the function completed without entering the CONFIG_HIGHMEM while > loop. > > entry > prune_icache(goal = 94037) > exit < prune_icache() - end of function > > Above the function went through the CONFIG_HIGHMEM while loop. This was > the first time this happened since boot, after a number of > prune_icache() calls that had returned prior to the CONFIG_HIGHMEM while > loop. > > entry > prune_icache(goal = 98609) > Unable to handle kernel NULL pointer dereference at virtual address 00000004 > > The final printk above shows the function being entered and then hitting > the NULL dereference. > > Chris > > On Sun, 30 May 2004, Chris Caputo wrote: > > Hi. I have been experiencing a number of crashes in fs/inode.c's > > prune_icache() function. I found on linux.bkbits.net that you made the > > most recent major change to this function back in January. With that in > > mind I hope it is okay to write directly to you. > > > > I have experienced two kinds of crashes with this function. > > > > The first is in the older part of the code. Basically the inode_unused > > list is somehow getting corrupt and when it does an Oops happens at: > > > > entry = entry->prev; (line 808 of the 2.4.26 fs/inode.c) > > > > I haven't yet figured out how it is getting corrupt so any tips welcome. > > > > A second problem I have seen is that my system has gotten into an infinite > > loop in the while loop in the CONFIG_HIGHMEM part of the prune_icache() > > code. I haven't yet figured out why. But I am curious about the code at > > the beginning of the loop: > > > > while (goal-- > 0) { > > if (list_empty(&inode_unused_pagecache)) > > break; > > entry = inode_unused_pagecache.prev; > > list_del(entry); > > list_add(entry, &inode_unused_pagecache); > > > > Is the intent of the last 3 lines to remove the entry from the end of the > > linked-list and then add it to the front, as a way of traversing the list? > > Or is it intended that the add be an add to the inode_unused list as > > opposed to the inode_unused_pagecache list? > > > > I'd love to figure out the problems I am experiencing, so any advice on > > how to proceed is welcome. The bug happens every few days on our main > > fileserver and I have been able to reproduce it on a test fileserver too. > > > > Chris > > > ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: crashes in prune_icache() in 2.4.26 2004-05-31 19:38 ` Chris Caputo @ 2004-06-21 1:56 ` Marcelo Tosatti 0 siblings, 0 replies; 3+ messages in thread From: Marcelo Tosatti @ 2004-06-21 1:56 UTC (permalink / raw) To: Chris Caputo; +Cc: linux-kernel, Rik van Riel, Trond Myklebust On Mon, May 31, 2004 at 12:38:48PM -0700, Chris Caputo wrote: > Made a little more progress... > > Turns out the infinite loop wasn't in the CONFIG_HIGMEM while() loop, but > rather in the while() loop at the top of the code. Sorry about that. I > misunderstood the disassembly until now. > > So basically both types of crashes I am seeing (NULL deref and infinite > loop) are happening in the primary/top while() loop of prune_icache() > and I suspect they are both the result of a corrupted inode_unused list. > > Now I am trying to figure out where/how the inode_unused list is getting > corrupted... If anyone has any existing code for validating list > integrity, which I could sprinkle around the code, I'd love a copy. > > Thanks, > Chris > > On Mon, 31 May 2004, Chris Caputo wrote: > > [CC'ed lkml in case anyone else wants to take a shot.] > > > > A little more info... I added some printk's to the function to highlight > > the input value of parameter 'goal' and also to show where the function > > was returning. > > > > My understanding is these printk's all happened in rapid succession: > > > > entry > prune_icache(goal = 92370) > > exit < prune_icache() at if (goal <= 0) > > > > Above the function completed without entering the CONFIG_HIGHMEM while > > loop. > > > > entry > prune_icache(goal = 94037) > > exit < prune_icache() - end of function > > > > Above the function went through the CONFIG_HIGHMEM while loop. This was > > the first time this happened since boot, after a number of > > prune_icache() calls that had returned prior to the CONFIG_HIGHMEM while > > loop. > > > > entry > prune_icache(goal = 98609) > > Unable to handle kernel NULL pointer dereference at virtual address 00000004 > > > > The final printk above shows the function being entered and then hitting > > the NULL dereference. Chris, Can you please post the full oops message ksymooped? I hope you saved that. > > > > Chris > > > > On Sun, 30 May 2004, Chris Caputo wrote: > > > Hi. I have been experiencing a number of crashes in fs/inode.c's > > > prune_icache() function. I found on linux.bkbits.net that you made the > > > most recent major change to this function back in January. With that in > > > mind I hope it is okay to write directly to you. > > > > > > I have experienced two kinds of crashes with this function. > > > > > > The first is in the older part of the code. Basically the inode_unused > > > list is somehow getting corrupt and when it does an Oops happens at: > > > > > > entry = entry->prev; (line 808 of the 2.4.26 fs/inode.c) > > > > > > I haven't yet figured out how it is getting corrupt so any tips welcome. > > > > > > A second problem I have seen is that my system has gotten into an infinite > > > loop in the while loop in the CONFIG_HIGHMEM part of the prune_icache() > > > code. I haven't yet figured out why. But I am curious about the code at > > > the beginning of the loop: > > > > > > while (goal-- > 0) { > > > if (list_empty(&inode_unused_pagecache)) > > > break; > > > entry = inode_unused_pagecache.prev; > > > list_del(entry); > > > list_add(entry, &inode_unused_pagecache); > > > > > > Is the intent of the last 3 lines to remove the entry from the end of the > > > linked-list and then add it to the front, as a way of traversing the list? > > > Or is it intended that the add be an add to the inode_unused list as > > > opposed to the inode_unused_pagecache list? > > > > > > I'd love to figure out the problems I am experiencing, so any advice on > > > how to proceed is welcome. The bug happens every few days on our main > > > fileserver and I have been able to reproduce it on a test fileserver too. > > > > > > Chris ^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2004-06-21 2:03 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <Pine.LNX.4.44.0405302141360.12337-100000@nacho.alt.net>
2004-05-31 7:00 ` crashes in prune_icache() in 2.4.26 Chris Caputo
2004-05-31 19:38 ` Chris Caputo
2004-06-21 1:56 ` Marcelo Tosatti
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox