linux-mm.kvack.org archive mirror
* possible bug in find_get_pages
@ 2009-03-06 19:26 mark gross
  2009-03-06 19:28 ` Christoph Lameter
  2009-03-06 23:47 ` Minchan Kim
  0 siblings, 2 replies; 9+ messages in thread
From: mark gross @ 2009-03-06 19:26 UTC (permalink / raw)
  To: linux-mm

I'm looking at a system hang (note: new hardware running stress
tests on an Ubuntu 2.6.27-11-generic kernel).

It seems that page->_count == 0 at some point on some overnight runs,
which locks the system into a tight loop between the repeat: label and
a goto repeat in find_get_pages.

Code inserted for convenience:

unsigned find_get_pages(struct address_space *mapping, pgoff_t start,
			    unsigned int nr_pages, struct page **pages)
{
	unsigned int i;
	unsigned int ret;
	unsigned int nr_found;

	rcu_read_lock();
restart:
	nr_found = radix_tree_gang_lookup_slot(&mapping->page_tree,
				(void ***)pages, start, nr_pages);
	ret = 0;
	for (i = 0; i < nr_found; i++) {
		struct page *page;
repeat:
		page = radix_tree_deref_slot((void **)pages[i]);
		if (unlikely(!page))
			continue;
		/*
		 * this can only trigger if nr_found == 1, making livelock
		 * a non issue.
		 */
		if (unlikely(page == RADIX_TREE_RETRY))
			goto restart;

		if (!page_cache_get_speculative(page))
			goto repeat;	/* <--- always hits here */

		/* Has the page moved? */
		if (unlikely(page != *((void **)pages[i]))) {
			page_cache_release(page);
			goto repeat;
		}

		pages[ret] = page;
		ret++;
	}
	rcu_read_unlock();
	return ret;
}

My question: as I read this code, I don't see any way out of the loop.
Once I get a page with zero _count from radix_tree_deref_slot, I will
get the same page forever.  The input to radix_tree_deref_slot never
changes, so I assume the output will be the same crappy page with zero
_count, which drops me on the goto repeat line.

Is this a bug?

Also, is having a page->_count == 0 an unexpected or invalid state?

Thanks!

--mgross





--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: possible bug in find_get_pages
  2009-03-06 19:26 possible bug in find_get_pages mark gross
@ 2009-03-06 19:28 ` Christoph Lameter
  2009-03-06 21:13   ` mark gross
  2009-03-10 10:49   ` Nick Piggin
  2009-03-06 23:47 ` Minchan Kim
  1 sibling, 2 replies; 9+ messages in thread
From: Christoph Lameter @ 2009-03-06 19:28 UTC (permalink / raw)
  To: mark gross; +Cc: linux-mm, npiggin

On Fri, 6 Mar 2009, mark gross wrote:

> It seems that page->_count == 0 at some point on some overnight runs,
> which locks the system into a tight loop between the repeat: label and
> a goto repeat in find_get_pages.

A page with ref count zero should not be in any mapping. If the page is in
a mapping then the page is used. Therefore the refcount should be > 0.

If there is a page with zero refcount and it's in a mapping then something
erroneously decreased the refcount.

Nick wrote the code so I CCed him.


* Re: possible bug in find_get_pages
  2009-03-06 19:28 ` Christoph Lameter
@ 2009-03-06 21:13   ` mark gross
  2009-03-06 21:29     ` Christoph Lameter
  2009-03-10 10:49   ` Nick Piggin
  1 sibling, 1 reply; 9+ messages in thread
From: mark gross @ 2009-03-06 21:13 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: linux-mm, npiggin

On Fri, Mar 06, 2009 at 02:28:50PM -0500, Christoph Lameter wrote:
> On Fri, 6 Mar 2009, mark gross wrote:
> 
> > It seems that page->_count == 0 at some point on some overnight runs,
> > which locks the system into a tight loop between the repeat: label and
> > a goto repeat in find_get_pages.
> 
> A page with ref count zero should not be in any mapping. If the page is in
> a mapping then the page is used. Therefore the refcount should be > 0.
> 
> If there is a page with zero refcount and it's in a mapping then something
> erroneously decreased the refcount.
> 
> Nick wrote the code so I CCed him.

thanks!  This is on early hardware so perhaps there isn't anything to
see here.  

Still, from a static read of the code, that goto repeat raises
eyebrows: why would anyone expect to get anything different from
radix_tree_deref_slot when calling it again with the same arguments?

--mgross


* Re: possible bug in find_get_pages
  2009-03-06 21:13   ` mark gross
@ 2009-03-06 21:29     ` Christoph Lameter
  0 siblings, 0 replies; 9+ messages in thread
From: Christoph Lameter @ 2009-03-06 21:29 UTC (permalink / raw)
  To: mark gross; +Cc: linux-mm, npiggin

On Fri, 6 Mar 2009, mark gross wrote:

> Still, from a static read of the code, that goto repeat raises
> eyebrows: why would anyone expect to get anything different from
> radix_tree_deref_slot when calling it again with the same arguments?

Another processor may be updating the same structure.
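
A minimal userspace sketch of this point, assuming C11 atomics as a
stand-in for the kernel primitives. Every demo_* name is hypothetical and
none of this is the kernel's code. The "other CPU" is simulated by a
callback that runs between attempts, and max_tries exists only so the
demo terminates. The retry can make progress only because the slot is
re-read on every pass.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; only the refcount matters here. */
struct demo_page {
	atomic_int count;
};

/* Shared slot, playing the role of one radix tree slot. */
static _Atomic(struct demo_page *) demo_slot;
static struct demo_page demo_stale, demo_fresh;

/* Simulates another CPU replacing the freed page in the slot. */
static void demo_other_cpu_replaces_page(void)
{
	atomic_store(&demo_fresh.count, 1);
	atomic_store(&demo_slot, &demo_fresh);
}

/* Take a reference unless the count is zero. */
static bool demo_try_get(struct demo_page *p)
{
	int old = atomic_load(&p->count);

	while (old != 0) {
		/* On failure, old is refreshed with the current count. */
		if (atomic_compare_exchange_weak(&p->count, &old, old + 1))
			return true;
	}
	return false;
}

/*
 * Retry loop in the style of the quoted function. max_tries exists only so
 * the demo terminates; other_cpu (may be NULL) runs between attempts.
 */
static struct demo_page *demo_lookup(void (*other_cpu)(void), int max_tries,
				     int *tries)
{
	for (*tries = 0; *tries < max_tries; ) {
		/* Re-read the slot on every attempt; it may have changed. */
		struct demo_page *page = atomic_load(&demo_slot);

		(*tries)++;
		if (!page)
			return NULL;
		if (!demo_try_get(page)) {
			if (other_cpu)
				other_cpu();
			continue;
		}
		/* Has the page moved? Drop the reference and retry if so. */
		if (page != atomic_load(&demo_slot)) {
			atomic_fetch_sub(&page->count, 1);
			continue;
		}
		return page;
	}
	return NULL;	/* slot never changed: the livelock case */
}
```

With no updater (other_cpu == NULL) and a zero-count page in the slot,
the loop spins until max_tries runs out, which mirrors the reported hang.
With the updater it succeeds on the second attempt.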


* Re: possible bug in find_get_pages
  2009-03-06 19:26 possible bug in find_get_pages mark gross
  2009-03-06 19:28 ` Christoph Lameter
@ 2009-03-06 23:47 ` Minchan Kim
  2009-03-09 16:43   ` mark gross
  1 sibling, 1 reply; 9+ messages in thread
From: Minchan Kim @ 2009-03-06 23:47 UTC (permalink / raw)
  To: mgross; +Cc: linux-mm, Nick Piggin, Christoph Lameter

Nick already found and solved this problem.
His patch may help you:

http://patchwork.kernel.org/patch/860/


-- 
Kind regards,
Minchan Kim


* Re: possible bug in find_get_pages
  2009-03-06 23:47 ` Minchan Kim
@ 2009-03-09 16:43   ` mark gross
  2009-03-10 10:45     ` Nick Piggin
  0 siblings, 1 reply; 9+ messages in thread
From: mark gross @ 2009-03-09 16:43 UTC (permalink / raw)
  To: Minchan Kim; +Cc: linux-mm, Nick Piggin, Christoph Lameter

On Sat, Mar 07, 2009 at 08:47:32AM +0900, Minchan Kim wrote:
> Nick already found and solved this problem.
> His patch may help you:
> 
> http://patchwork.kernel.org/patch/860/
> 

Wow, this reads just like the problem we are seeing.  I'll try the
patch and let the test run for a few days!

We've even seen it come out of the livelock once in a while.  I was
thinking it was a cache coherency HW issue until this :)

I'll send an update after running the test.

thanks!

--mgross



* Re: possible bug in find_get_pages
  2009-03-09 16:43   ` mark gross
@ 2009-03-10 10:45     ` Nick Piggin
  2009-03-10 22:49       ` mark gross
  0 siblings, 1 reply; 9+ messages in thread
From: Nick Piggin @ 2009-03-10 10:45 UTC (permalink / raw)
  To: mark gross; +Cc: Minchan Kim, linux-mm, Christoph Lameter

On Mon, Mar 09, 2009 at 09:43:16AM -0700, mark gross wrote:
> Wow, this reads just like the problem we are seeing.  I'll try the
> patch and let the test run for a few days!
> 
> We've even seen it come out of the livelock once in a while.  I was
> thinking it was a cache coherency HW issue until this :)
> 
> I'll send an update after running the test.

Note that after some discussion, the accepted fix looks a bit
different (and might potentially fix another problem if the compiler
gets very smart, although gcc doesn't seem to).

Git commit e8c82c2e23e3527e0c9dc195e432c16784d270fa


* Re: possible bug in find_get_pages
  2009-03-06 19:28 ` Christoph Lameter
  2009-03-06 21:13   ` mark gross
@ 2009-03-10 10:49   ` Nick Piggin
  1 sibling, 0 replies; 9+ messages in thread
From: Nick Piggin @ 2009-03-10 10:49 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: mark gross, linux-mm

On Fri, Mar 06, 2009 at 02:28:50PM -0500, Christoph Lameter wrote:
> On Fri, 6 Mar 2009, mark gross wrote:
> 
> > It seems that page->_count == 0 at some point on some overnight runs,
> > which locks the system into a tight loop between the repeat: label and
> > a goto repeat in find_get_pages.
> 
> A page with ref count zero should not be in any mapping. If the page is in
> a mapping then the page is used. Therefore the refcount should be > 0.
> 
> If there is a page with zero refcount and it's in a mapping then something
> erroneously decreased the refcount.

Just for posterity, this isn't _quite_ true any more with Hugh's
variation to the speculative reference method. We now in some
places set the page's refcount to 0 in order to hold off new
speculative references from turning into real references (eg. right
before final checks before page reclaim).

But yes, such a page should not remain both in a mapping and with a
0 refcount for long periods.
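
A rough userspace sketch of the idea described above, assuming C11
atomics in place of the kernel's atomic ops. The demo_* names are
hypothetical and this is not the kernel implementation. It shows only
the interaction: a speculative get refuses to touch a zero count, so
dropping the count to zero holds off new references until the count is
restored.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the refcount matters here. */
struct demo_page {
	atomic_int count;
};

/* Take a reference unless the count is zero ("increment unless zero"). */
static bool demo_get_speculative(struct demo_page *p)
{
	int old = atomic_load(&p->count);

	while (old != 0) {
		/* On failure, old is refreshed with the current count. */
		if (atomic_compare_exchange_weak(&p->count, &old, old + 1))
			return true;
	}
	return false;	/* count was zero: caller must look the page up again */
}

/*
 * Drop the count from 'expected' to 0 so that no new speculative reference
 * can succeed. Fails if someone else holds an extra reference.
 */
static bool demo_freeze_refs(struct demo_page *p, int expected)
{
	return atomic_compare_exchange_strong(&p->count, &expected, 0);
}

/* Restore the count once the final checks decided to keep the page. */
static void demo_unfreeze_refs(struct demo_page *p, int count)
{
	atomic_store(&p->count, count);
}
```

While the count is frozen at zero, a lookup that retries is expected to
spin only briefly: either the page leaves the mapping or the count is
restored.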


* Re: possible bug in find_get_pages
  2009-03-10 10:45     ` Nick Piggin
@ 2009-03-10 22:49       ` mark gross
  0 siblings, 0 replies; 9+ messages in thread
From: mark gross @ 2009-03-10 22:49 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Minchan Kim, linux-mm, Christoph Lameter

On Tue, Mar 10, 2009 at 11:45:52AM +0100, Nick Piggin wrote:
> Note that after some discussion, the accepted fix looks a bit
> different (and might potentially fix another problem if the compiler
> gets very smart, although gcc doesn't seem to).
> 
> Git commit e8c82c2e23e3527e0c9dc195e432c16784d270fa

Yes, we are testing with this one-liner fix, 30 hrs and counting.  It's
looking pretty good.

thanks!

--mgross

