* possible bug in find_get_pages
@ 2009-03-06 19:26 mark gross
2009-03-06 19:28 ` Christoph Lameter
2009-03-06 23:47 ` Minchan Kim
0 siblings, 2 replies; 9+ messages in thread
From: mark gross @ 2009-03-06 19:26 UTC (permalink / raw)
To: linux-mm
I'm looking at a system hang (note: new hardware going under stress
tests using an Ubuntu 2.6.27-11-generic kernel).
It seems that page->_count == 0 at some point on some overnight runs,
which locks the system into a tight loop between the repeat: label and
a goto repeat in find_get_pages.
Code inserted for convenience:
unsigned find_get_pages(struct address_space *mapping, pgoff_t start,
			unsigned int nr_pages, struct page **pages)
{
	unsigned int i;
	unsigned int ret;
	unsigned int nr_found;

	rcu_read_lock();
restart:
	nr_found = radix_tree_gang_lookup_slot(&mapping->page_tree,
				(void ***)pages, start, nr_pages);
	ret = 0;
	for (i = 0; i < nr_found; i++) {
		struct page *page;
repeat:
		page = radix_tree_deref_slot((void **)pages[i]);
		if (unlikely(!page))
			continue;
		/*
		 * this can only trigger if nr_found == 1, making livelock
		 * a non issue.
		 */
		if (unlikely(page == RADIX_TREE_RETRY))
			goto restart;

		if (!page_cache_get_speculative(page))
			goto repeat;	/* <--------- always hits */

		/* Has the page moved? */
		if (unlikely(page != *((void **)pages[i]))) {
			page_cache_release(page);
			goto repeat;
		}

		pages[ret] = page;
		ret++;
	}
	rcu_read_unlock();
	return ret;
}
My question: as I read this code, I don't see any way out of the loop.
Once I get a page with zero _count from radix_tree_deref_slot, I will
get the same page forever. The input to radix_tree_deref_slot never
changes, so I assume the output will be the same crappy page with zero
_count, which drops me on the goto repeat line.
Is this a bug?
Also, is having a page->_count == 0 an unexpected or invalid state?
Thanks!
--mgross
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: possible bug in find_get_pages
2009-03-06 19:26 possible bug in find_get_pages mark gross
@ 2009-03-06 19:28 ` Christoph Lameter
2009-03-06 21:13 ` mark gross
2009-03-10 10:49 ` Nick Piggin
2009-03-06 23:47 ` Minchan Kim
1 sibling, 2 replies; 9+ messages in thread
From: Christoph Lameter @ 2009-03-06 19:28 UTC (permalink / raw)
To: mark gross; +Cc: linux-mm, npiggin
On Fri, 6 Mar 2009, mark gross wrote:
> It seems that page->_count == 0 at some point on some overnight runs,
> which locks the system into a tight loop between the repeat: label and
> a goto repeat in find_get_pages.
A page with ref count zero should not be in any mapping. If the page is in
a mapping then the page is used. Therefore the refcount should be > 0.
If there is a page with zero refcount and it is in a mapping, then
something erroneously decreased the refcount.
Nick wrote the code so I CCed him.
* Re: possible bug in find_get_pages
2009-03-06 19:28 ` Christoph Lameter
@ 2009-03-06 21:13 ` mark gross
2009-03-06 21:29 ` Christoph Lameter
2009-03-10 10:49 ` Nick Piggin
1 sibling, 1 reply; 9+ messages in thread
From: mark gross @ 2009-03-06 21:13 UTC (permalink / raw)
To: Christoph Lameter; +Cc: linux-mm, npiggin
On Fri, Mar 06, 2009 at 02:28:50PM -0500, Christoph Lameter wrote:
> On Fri, 6 Mar 2009, mark gross wrote:
>
> > It seems that page->_count == 0 at some point on some overnight runs,
> > which locks the system into a tight loop between the repeat: label and
> > a goto repeat in find_get_pages.
>
> A page with ref count zero should not be in any mapping. If the page is in
> a mapping then the page is used. Therefore the refcount should be > 0.
>
> If there is a page with zero refcount and it is in a mapping, then
> something erroneously decreased the refcount.
>
> Nick wrote the code so I CCed him.
Thanks! This is on early hardware, so perhaps there isn't anything to
see here.
Still, from a static read of the code, that goto repeat raises
eyebrows: why would anyone expect to get anything different from
radix_tree_deref_slot when calling it again with the same arguments?
--mgross
* Re: possible bug in find_get_pages
2009-03-06 21:13 ` mark gross
@ 2009-03-06 21:29 ` Christoph Lameter
0 siblings, 0 replies; 9+ messages in thread
From: Christoph Lameter @ 2009-03-06 21:29 UTC (permalink / raw)
To: mark gross; +Cc: linux-mm, npiggin
On Fri, 6 Mar 2009, mark gross wrote:
> Still, from a static read of the code, that goto repeat raises
> eyebrows: why would anyone expect to get anything different from
> radix_tree_deref_slot when calling it again with the same arguments?
Another processor may be updating the same structure.
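As a rough user-space illustration of this point (not kernel code): the
retry is only useful because each pass reloads the slot, so a concurrent
update can change the outcome. The sketch assumes C11 atomics. The names
fake_page, slot_t, slot_get, clear_slot and the max_tries bound are
hypothetical and invented for the example, and the other_cpu callback
merely simulates a second processor from within a single thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; only the refcount is modelled. */
struct fake_page {
	atomic_int count;
};

/* A slot that a reader and a (simulated) concurrent updater both touch. */
typedef _Atomic(struct fake_page *) slot_t;

/* Take a reference only if the count is currently non-zero. */
static bool get_unless_zero(struct fake_page *p)
{
	int old = atomic_load(&p->count);

	while (old != 0) {
		/* On failure, old is refreshed with the current count. */
		if (atomic_compare_exchange_weak(&p->count, &old, old + 1))
			return true;
	}
	return false;
}

/*
 * Look up the page in *slot and take a reference on it.
 * other_cpu (may be NULL) stands in for work another processor does
 * between our attempts. max_tries bounds the loop so that the sketch
 * always terminates; NULL is returned both when the slot is found
 * empty and when the tries run out.
 */
static struct fake_page *slot_get(slot_t *slot, void (*other_cpu)(slot_t *),
				  int max_tries)
{
	assert(max_tries > 0);

	for (int i = 0; i < max_tries; i++) {
		/* Fresh load on every pass: this is what lets the result change. */
		struct fake_page *page = atomic_load(slot);

		if (!page)
			return NULL;
		if (get_unless_zero(page)) {
			if (page == atomic_load(slot))
				return page;
			/* The slot now holds something else: drop and retry. */
			atomic_fetch_sub(&page->count, 1);
		}
		if (other_cpu)
			other_cpu(slot);
	}
	return NULL;
}

/* Simulated updater: removes the dying page from the slot. */
static void clear_slot(slot_t *slot)
{
	atomic_store(slot, NULL);
}
```

With other_cpu == NULL and a page whose count stays at zero, slot_get
uses up all of its tries, which mirrors the livelock reported above.
With clear_slot as the updater, the second pass sees the empty slot and
the loop ends.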
* Re: possible bug in find_get_pages
2009-03-06 19:26 possible bug in find_get_pages mark gross
2009-03-06 19:28 ` Christoph Lameter
@ 2009-03-06 23:47 ` Minchan Kim
2009-03-09 16:43 ` mark gross
1 sibling, 1 reply; 9+ messages in thread
From: Minchan Kim @ 2009-03-06 23:47 UTC (permalink / raw)
To: mgross; +Cc: linux-mm, Nick Piggin, Christoph Lameter
Nick already found and fixed this problem.
His patch may help you:
http://patchwork.kernel.org/patch/860/
--
Kind regards,
Minchan Kim
* Re: possible bug in find_get_pages
2009-03-06 23:47 ` Minchan Kim
@ 2009-03-09 16:43 ` mark gross
2009-03-10 10:45 ` Nick Piggin
0 siblings, 1 reply; 9+ messages in thread
From: mark gross @ 2009-03-09 16:43 UTC (permalink / raw)
To: Minchan Kim; +Cc: linux-mm, Nick Piggin, Christoph Lameter
On Sat, Mar 07, 2009 at 08:47:32AM +0900, Minchan Kim wrote:
> Nick already found and fixed this problem.
> His patch may help you:
>
> http://patchwork.kernel.org/patch/860/
>
Wow, this reads just like the problem we are seeing. I'll try the
patch and let the test run for a few days!
We've even seen it come out of the livelock once in a while as well. I
was thinking it was a cache-coherency HW issue until this :)
I'll send an update after running the test.
Thanks!
--mgross
* Re: possible bug in find_get_pages
2009-03-09 16:43 ` mark gross
@ 2009-03-10 10:45 ` Nick Piggin
2009-03-10 22:49 ` mark gross
0 siblings, 1 reply; 9+ messages in thread
From: Nick Piggin @ 2009-03-10 10:45 UTC (permalink / raw)
To: mark gross; +Cc: Minchan Kim, linux-mm, Christoph Lameter
On Mon, Mar 09, 2009 at 09:43:16AM -0700, mark gross wrote:
> On Sat, Mar 07, 2009 at 08:47:32AM +0900, Minchan Kim wrote:
> > Nick already found and fixed this problem.
> > His patch may help you:
> >
> > http://patchwork.kernel.org/patch/860/
> >
>
> Wow, this reads just like the problem we are seeing. I'll try the
> patch and let the test run for a few days!
>
> We've even seen it come out of the livelock once in a while as well. I
> was thinking it was a cache-coherency HW issue until this :)
>
> I'll send an update after running the test.
Note that after some discussion, the accepted fix looks a bit
different (and might potentially fix another problem if the compiler
gets very smart, although gcc doesn't seem to).
Git commit e8c82c2e23e3527e0c9dc195e432c16784d270fa
* Re: possible bug in find_get_pages
2009-03-06 19:28 ` Christoph Lameter
2009-03-06 21:13 ` mark gross
@ 2009-03-10 10:49 ` Nick Piggin
1 sibling, 0 replies; 9+ messages in thread
From: Nick Piggin @ 2009-03-10 10:49 UTC (permalink / raw)
To: Christoph Lameter; +Cc: mark gross, linux-mm
On Fri, Mar 06, 2009 at 02:28:50PM -0500, Christoph Lameter wrote:
> On Fri, 6 Mar 2009, mark gross wrote:
>
> > It seems that page->_count == 0 at some point on some overnight runs,
> > which locks the system into a tight loop between the repeat: label and
> > a goto repeat in find_get_pages.
>
> A page with ref count zero should not be in any mapping. If the page is in
> a mapping then the page is used. Therefore the refcount should be > 0.
>
> If there is a page with zero refcount and it is in a mapping, then
> something erroneously decreased the refcount.
Just for posterity, this isn't _quite_ true any more with Hugh's
variation to the speculative reference method. We now in some
places set the page's refcount to 0 in order to hold off new
speculative references from turning into real references (eg. right
before final checks before page reclaim).
But yes, such a page should not remain both in a mapping and with a
0 refcount for long periods.
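A minimal user-space sketch of the freeze idea described above, assuming
C11 atomics rather than the kernel's atomic_t API. The names fake_page,
get_unless_zero, freeze_refs and unfreeze_refs are hypothetical and only
loosely modelled on the kernel helpers; the real implementation may
differ in detail.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the refcount is modelled. */
struct fake_page {
	atomic_int count;
};

/* Speculative get: take a reference only if the count is non-zero. */
static bool get_unless_zero(struct fake_page *p)
{
	int old = atomic_load(&p->count);

	while (old != 0) {
		/* On failure, old is refreshed with the current count. */
		if (atomic_compare_exchange_weak(&p->count, &old, old + 1))
			return true;
	}
	return false;
}

/*
 * Freeze: if the count is exactly what the caller expects, swap it
 * for 0 so that speculative getters fail until it is restored.
 */
static bool freeze_refs(struct fake_page *p, int expected)
{
	return atomic_compare_exchange_strong(&p->count, &expected, 0);
}

/* Unfreeze: restore a count on a page that is currently frozen. */
static void unfreeze_refs(struct fake_page *p, int count)
{
	assert(atomic_load(&p->count) == 0);
	atomic_store(&p->count, count);
}
```

While the page is frozen, get_unless_zero returns false, so a lookup
loop that retries on failure keeps spinning until the count is restored
or the page leaves the mapping. That is why the frozen window has to be
short.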
* Re: possible bug in find_get_pages
2009-03-10 10:45 ` Nick Piggin
@ 2009-03-10 22:49 ` mark gross
0 siblings, 0 replies; 9+ messages in thread
From: mark gross @ 2009-03-10 22:49 UTC (permalink / raw)
To: Nick Piggin; +Cc: Minchan Kim, linux-mm, Christoph Lameter
On Tue, Mar 10, 2009 at 11:45:52AM +0100, Nick Piggin wrote:
> On Mon, Mar 09, 2009 at 09:43:16AM -0700, mark gross wrote:
> > On Sat, Mar 07, 2009 at 08:47:32AM +0900, Minchan Kim wrote:
> > > Nick already found and fixed this problem.
> > > His patch may help you:
> > >
> > > http://patchwork.kernel.org/patch/860/
> > >
> >
> > Wow, this reads just like the problem we are seeing. I'll try the
> > patch and let the test run for a few days!
> >
> > We've even seen it come out of the livelock once in a while as well. I
> > was thinking it was a cache-coherency HW issue until this :)
> >
> > I'll send an update after running the test.
>
> Note that after some discussion, the accepted fix looks a bit
> different (and might potentially fix another problem if the compiler
> gets very smart, although gcc doesn't seem to).
>
> Git commit e8c82c2e23e3527e0c9dc195e432c16784d270fa
Yes, we are testing with this one-line fix: 30 hrs and counting. It's
looking pretty good.
Thanks!
--mgross