From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mail143.messagelabs.com (mail143.messagelabs.com [216.82.254.35])
	by kanga.kvack.org (Postfix) with SMTP id DD1756B011E;
	Fri, 6 Mar 2009 14:22:46 -0500 (EST)
Date: Fri, 6 Mar 2009 11:26:25 -0800
From: mark gross
Subject: possible bug in find_get_pages
Message-ID: <20090306192625.GA3267@linux.intel.com>
Reply-To: mgross@linux.intel.com
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-linux-mm@kvack.org
To: linux-mm@kvack.org

I'm looking at a system hang (note: new hardware going under stress
tests using a ubuntu 2.6.27-11-generic)

It seems that page->_count == 0 at some point on some overnight runs
which locks the system into a tight loop from the repeat: and a goto
repeat in find_get_pages.

Code inserted for convenience:

unsigned find_get_pages(struct address_space *mapping, pgoff_t start,
			    unsigned int nr_pages, struct page **pages)
{
	unsigned int i;
	unsigned int ret;
	unsigned int nr_found;

	rcu_read_lock();
restart:
	nr_found = radix_tree_gang_lookup_slot(&mapping->page_tree,
				(void ***)pages, start, nr_pages);
	ret = 0;
	for (i = 0; i < nr_found; i++) {
		struct page *page;
repeat:
		page = radix_tree_deref_slot((void **)pages[i]);
		if (unlikely(!page))
			continue;
		/*
		 * this can only trigger if nr_found == 1, making livelock
		 * a non issue.
		 */
		if (unlikely(page == RADIX_TREE_RETRY))
			goto restart;

		if (!page_cache_get_speculative(page))
			goto repeat;	/* <---------_always_hits_ */

		/* Has the page moved? */
		if (unlikely(page != *((void **)pages[i]))) {
			page_cache_release(page);
			goto repeat;
		}

		pages[ret] = page;
		ret++;
	}
	rcu_read_unlock();
	return ret;
}

My question is that as I look at this code I don't see any way out of it
once I get a page with zero _count from radix_tree_deref_slot, then I
will get the same page forever.
The input to radix_tree_deref_slot
never changes so I assume the output should be the same crappy page with
zero _count that drops me on the goto repeat line.

Is this a bug?

Also, is having a page->_count == 0 an unexpected or invalid state?

Thanks!

--mgross

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mail202.messagelabs.com (mail202.messagelabs.com [216.82.254.227])
	by kanga.kvack.org (Postfix) with SMTP id 8A2B46B0123;
	Fri, 6 Mar 2009 14:39:35 -0500 (EST)
Received: from localhost (smtp.ultrahosting.com [127.0.0.1])
	by smtp.ultrahosting.com (Postfix) with ESMTP id A804782D7C2;
	Fri, 6 Mar 2009 14:45:10 -0500 (EST)
Received: from smtp.ultrahosting.com ([74.213.175.254])
	by localhost (smtp.ultrahosting.com [127.0.0.1]) (amavisd-new, port 10024)
	with ESMTP id 0U8MM5It+m9F; Fri, 6 Mar 2009 14:45:06 -0500 (EST)
Received: from qirst.com (unknown [74.213.171.31])
	by smtp.ultrahosting.com (Postfix) with ESMTP id D15C982D7B6;
	Fri, 6 Mar 2009 14:45:02 -0500 (EST)
Date: Fri, 6 Mar 2009 14:28:50 -0500 (EST)
From: Christoph Lameter
Subject: Re: possible bug in find_get_pages
In-Reply-To: <20090306192625.GA3267@linux.intel.com>
References: <20090306192625.GA3267@linux.intel.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-linux-mm@kvack.org
To: mark gross
Cc: linux-mm@kvack.org, npiggin@suse.de

On Fri, 6 Mar 2009, mark gross wrote:

> It seems that page->_count == 0 at some point on some overnight runs
> which locks the system into a tight loop from the repeat: and a goto
> repeat in find_get_pages.

A page with ref count zero should not be in any mapping. If the page is in
a mapping then the page is used. Therefore the refcount should be > 0.
If there is a page with zero refcount and it's in a mapping then something
erroneously decreased the refcount.

Nick wrote the code so I CCed him.

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mail144.messagelabs.com (mail144.messagelabs.com [216.82.254.51])
	by kanga.kvack.org (Postfix) with SMTP id 32C0A6B0083;
	Fri, 6 Mar 2009 16:10:13 -0500 (EST)
Date: Fri, 6 Mar 2009 13:13:36 -0800
From: mark gross
Subject: Re: possible bug in find_get_pages
Message-ID: <20090306211336.GA5981@linux.intel.com>
Reply-To: mgross@linux.intel.com
References: <20090306192625.GA3267@linux.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-linux-mm@kvack.org
To: Christoph Lameter
Cc: linux-mm@kvack.org, npiggin@suse.de

On Fri, Mar 06, 2009 at 02:28:50PM -0500, Christoph Lameter wrote:
> On Fri, 6 Mar 2009, mark gross wrote:
>
> > It seems that page->_count == 0 at some point on some overnight runs
> > which locks the system into a tight loop from the repeat: and a goto
> > repeat in find_get_pages.
>
> A page with ref count zero should not be in any mapping. If the page is in
> a mapping then the page is used. Therefore the refcount should be > 0.
>
> If there is a page with zero refcount and it's in a mapping then something
> erroneously decreased the refcount.
>
> Nick wrote the code so I CCed him.

thanks!

This is on early hardware so perhaps there isn't anything to see here.
Still, from a static read of the code that goto repeat raises
eyebrows as why would anyone expect to get anything different from
radix_tree_deref_slot calling it again with the same arguments?

--mgross

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mail191.messagelabs.com (mail191.messagelabs.com [216.82.242.19])
	by kanga.kvack.org (Postfix) with SMTP id 5471E6B008A;
	Fri, 6 Mar 2009 16:42:02 -0500 (EST)
Received: from localhost (smtp.ultrahosting.com [127.0.0.1])
	by smtp.ultrahosting.com (Postfix) with ESMTP id 40C0F82D879;
	Fri, 6 Mar 2009 16:47:40 -0500 (EST)
Received: from smtp.ultrahosting.com ([74.213.175.254])
	by localhost (smtp.ultrahosting.com [127.0.0.1]) (amavisd-new, port 10024)
	with ESMTP id 4rKtM1SdHtCl; Fri, 6 Mar 2009 16:47:35 -0500 (EST)
Received: from qirst.com (unknown [74.213.171.31])
	by smtp.ultrahosting.com (Postfix) with ESMTP id 1BB5F82D875;
	Fri, 6 Mar 2009 16:45:12 -0500 (EST)
Date: Fri, 6 Mar 2009 16:29:23 -0500 (EST)
From: Christoph Lameter
Subject: Re: possible bug in find_get_pages
In-Reply-To: <20090306211336.GA5981@linux.intel.com>
References: <20090306192625.GA3267@linux.intel.com> <20090306211336.GA5981@linux.intel.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-linux-mm@kvack.org
To: mark gross
Cc: linux-mm@kvack.org, npiggin@suse.de

On Fri, 6 Mar 2009, mark gross wrote:

> Still, from a static read of the code that goto repeat raises
> eyebrows as why would anyone expect to get anything different from
> radix_tree_deref_slot calling it again with the same arguments?

Another processor may be updating the same structure.

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mail172.messagelabs.com (mail172.messagelabs.com [216.82.254.3])
	by kanga.kvack.org (Postfix) with SMTP id 4CE246B004F;
	Fri, 6 Mar 2009 18:49:49 -0500 (EST)
Received: by ti-out-0910.google.com with SMTP id u3so347688tia.8;
	Fri, 06 Mar 2009 15:49:46 -0800 (PST)
Date: Sat, 7 Mar 2009 08:47:32 +0900
From: Minchan Kim
Subject: Re: possible bug in find_get_pages
Message-Id: <20090307084732.b01bcfee.minchan.kim@barrios-desktop>
In-Reply-To: <20090306192625.GA3267@linux.intel.com>
References: <20090306192625.GA3267@linux.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
To: mgross@linux.intel.com
Cc: linux-mm@kvack.org, Nick Piggin, Christoph Lameter

Nick already found and solved this problem.
It can help you.

http://patchwork.kernel.org/patch/860/

> On Fri, 6 Mar 2009 11:26:25 -0800
> mark gross wrote:
>
> I'm looking at a system hang (note: new hardware going under stress
> tests using a ubuntu 2.6.27-11-generic)
>
> It seems that page->_count == 0 at some point on some overnight runs
> which locks the system into a tight loop from the repeat: and a goto
> repeat in find_get_pages.
>
> Code inserted for convenience:
>
> unsigned find_get_pages(struct address_space *mapping, pgoff_t start,
> 			    unsigned int nr_pages, struct page **pages)
> {
> 	unsigned int i;
> 	unsigned int ret;
> 	unsigned int nr_found;
>
> 	rcu_read_lock();
> restart:
> 	nr_found = radix_tree_gang_lookup_slot(&mapping->page_tree,
> 				(void ***)pages, start, nr_pages);
> 	ret = 0;
> 	for (i = 0; i < nr_found; i++) {
> 		struct page *page;
> repeat:
> 		page = radix_tree_deref_slot((void **)pages[i]);
> 		if (unlikely(!page))
> 			continue;
> 		/*
> 		 * this can only trigger if nr_found == 1, making livelock
> 		 * a non issue.
> 		 */
> 		if (unlikely(page == RADIX_TREE_RETRY))
> 			goto restart;
>
> 		if (!page_cache_get_speculative(page))
> 			goto repeat;	/* <---------_always_hits_ */
>
> 		/* Has the page moved? */
> 		if (unlikely(page != *((void **)pages[i]))) {
> 			page_cache_release(page);
> 			goto repeat;
> 		}
>
> 		pages[ret] = page;
> 		ret++;
> 	}
> 	rcu_read_unlock();
> 	return ret;
> }
>
> My question is that as I look at this code I don't see any way out of it
> once I get a page with zero _count from radix_tree_deref_slot, then I
> will get the same page forever. The input to radix_tree_deref_slot
> never changes so I assume the output should be the same crappy page with
> zero _count that drops me on the goto repeat line.
>
> Is this a bug?
>
> Also, is having a page->_count == 0 an unexpected or invalid state?
>
> Thanks!
>
> --mgross

-- 
Kind Regards
Minchan Kim

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mail191.messagelabs.com (mail191.messagelabs.com [216.82.242.19])
	by kanga.kvack.org (Postfix) with SMTP id 9327B6B003D;
	Mon, 9 Mar 2009 12:39:35 -0400 (EDT)
Date: Mon, 9 Mar 2009 09:43:16 -0700
From: mark gross
Subject: Re: possible bug in find_get_pages
Message-ID: <20090309164316.GB31140@linux.intel.com>
Reply-To: mgross@linux.intel.com
References: <20090306192625.GA3267@linux.intel.com> <20090307084732.b01bcfee.minchan.kim@barrios-desktop>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20090307084732.b01bcfee.minchan.kim@barrios-desktop>
Sender: owner-linux-mm@kvack.org
To: Minchan Kim
Cc: linux-mm@kvack.org, Nick Piggin, Christoph Lameter

On Sat, Mar 07, 2009 at 08:47:32AM +0900, Minchan Kim wrote:
> Nick already found and solved this problem.
> It can help you.
>
> http://patchwork.kernel.org/patch/860/
>

Wow, this reads just like the problem we are seeing.  I'll try the
patch and let the test run for a few days!

We've even seen it come out of the live lock once in a while as well.  I
was thinking cache coherency HW issue until this :)

I'll send an update after running the test.

thanks!

--mgross

>
> > On Fri, 6 Mar 2009 11:26:25 -0800
> > mark gross wrote:
> >
> > I'm looking at a system hang (note: new hardware going under stress
> > tests using a ubuntu 2.6.27-11-generic)
> >
> > It seems that page->_count == 0 at some point on some overnight runs
> > which locks the system into a tight loop from the repeat: and a goto
> > repeat in find_get_pages.
> >
> > Code inserted for convenience:
> >
> > unsigned find_get_pages(struct address_space *mapping, pgoff_t start,
> > 			    unsigned int nr_pages, struct page **pages)
> > {
> > 	unsigned int i;
> > 	unsigned int ret;
> > 	unsigned int nr_found;
> >
> > 	rcu_read_lock();
> > restart:
> > 	nr_found = radix_tree_gang_lookup_slot(&mapping->page_tree,
> > 				(void ***)pages, start, nr_pages);
> > 	ret = 0;
> > 	for (i = 0; i < nr_found; i++) {
> > 		struct page *page;
> > repeat:
> > 		page = radix_tree_deref_slot((void **)pages[i]);
> > 		if (unlikely(!page))
> > 			continue;
> > 		/*
> > 		 * this can only trigger if nr_found == 1, making livelock
> > 		 * a non issue.
> > 		 */
> > 		if (unlikely(page == RADIX_TREE_RETRY))
> > 			goto restart;
> >
> > 		if (!page_cache_get_speculative(page))
> > 			goto repeat;	/* <---------_always_hits_ */
> >
> > 		/* Has the page moved? */
> > 		if (unlikely(page != *((void **)pages[i]))) {
> > 			page_cache_release(page);
> > 			goto repeat;
> > 		}
> >
> > 		pages[ret] = page;
> > 		ret++;
> > 	}
> > 	rcu_read_unlock();
> > 	return ret;
> > }
> >
> > My question is that as I look at this code I don't see any way out of it
> > once I get a page with zero _count from radix_tree_deref_slot, then I
> > will get the same page forever. The input to radix_tree_deref_slot
> > never changes so I assume the output should be the same crappy page with
> > zero _count that drops me on the goto repeat line.
> >
> > Is this a bug?
> >
> > Also, is having a page->_count == 0 an unexpected or invalid state?
> >
> > Thanks!
> >
> > --mgross
>
> -- 
> Kind Regards
> Minchan Kim

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mail137.messagelabs.com (mail137.messagelabs.com [216.82.249.19])
	by kanga.kvack.org (Postfix) with ESMTP id DF5EC6B003D;
	Tue, 10 Mar 2009 06:45:56 -0400 (EDT)
Date: Tue, 10 Mar 2009 11:45:52 +0100
From: Nick Piggin
Subject: Re: possible bug in find_get_pages
Message-ID: <20090310104552.GA4594@wotan.suse.de>
References: <20090306192625.GA3267@linux.intel.com> <20090307084732.b01bcfee.minchan.kim@barrios-desktop> <20090309164316.GB31140@linux.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20090309164316.GB31140@linux.intel.com>
Sender: owner-linux-mm@kvack.org
To: mark gross
Cc: Minchan Kim, linux-mm@kvack.org, Christoph Lameter

On Mon, Mar 09, 2009 at 09:43:16AM -0700, mark gross wrote:
> On Sat, Mar 07, 2009 at 08:47:32AM +0900, Minchan Kim wrote:
> > Nick already found and solved this problem.
> > It can help you.
> >
> > http://patchwork.kernel.org/patch/860/
> >
>
> Wow, this reads just like the problem we are seeing.  I'll try the
> patch and let the test run for a few days!
>
> We've even seen it come out of the live lock once in a while as well.  I
> was thinking cache coherency HW issue until this :)
>
> I'll send an update after running the test.

Note that after some discussion, the accepted fix looks a bit
different (and might potentially fix another problem if the compiler
gets very smart, although gcc doesn't seem to).

Git commit e8c82c2e23e3527e0c9dc195e432c16784d270fa

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mail172.messagelabs.com (mail172.messagelabs.com [216.82.254.3])
	by kanga.kvack.org (Postfix) with ESMTP id 7062B6B0047;
	Tue, 10 Mar 2009 06:49:50 -0400 (EDT)
Date: Tue, 10 Mar 2009 11:49:47 +0100
From: Nick Piggin
Subject: Re: possible bug in find_get_pages
Message-ID: <20090310104947.GB4594@wotan.suse.de>
References: <20090306192625.GA3267@linux.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-linux-mm@kvack.org
To: Christoph Lameter
Cc: mark gross, linux-mm@kvack.org

On Fri, Mar 06, 2009 at 02:28:50PM -0500, Christoph Lameter wrote:
> On Fri, 6 Mar 2009, mark gross wrote:
>
> > It seems that page->_count == 0 at some point on some overnight runs
> > which locks the system into a tight loop from the repeat: and a goto
> > repeat in find_get_pages.
>
> A page with ref count zero should not be in any mapping. If the page is in
> a mapping then the page is used. Therefore the refcount should be > 0.
>
> If there is a page with zero refcount and it's in a mapping then something
> erroneously decreased the refcount.

Just for posterity, this isn't _quite_ true any more with Hugh's
variation to the speculative reference method. We now in some places
set the page's refcount to 0 in order to hold off new speculative
references from turning into real references (eg. right before
final checks before page reclaim).

But yes, such a page should not remain both in a mapping and with a
0 refcount for long periods.

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mail190.messagelabs.com (mail190.messagelabs.com [216.82.249.51])
	by kanga.kvack.org (Postfix) with SMTP id 1F5706B003D;
	Tue, 10 Mar 2009 18:45:40 -0400 (EDT)
Date: Tue, 10 Mar 2009 15:49:11 -0700
From: mark gross
Subject: Re: possible bug in find_get_pages
Message-ID: <20090310224911.GA16630@linux.intel.com>
Reply-To: mgross@linux.intel.com
References: <20090306192625.GA3267@linux.intel.com> <20090307084732.b01bcfee.minchan.kim@barrios-desktop> <20090309164316.GB31140@linux.intel.com> <20090310104552.GA4594@wotan.suse.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20090310104552.GA4594@wotan.suse.de>
Sender: owner-linux-mm@kvack.org
To: Nick Piggin
Cc: Minchan Kim, linux-mm@kvack.org, Christoph Lameter

On Tue, Mar 10, 2009 at 11:45:52AM +0100, Nick Piggin wrote:
> On Mon, Mar 09, 2009 at 09:43:16AM -0700, mark gross wrote:
> > On Sat, Mar 07, 2009 at 08:47:32AM +0900, Minchan Kim wrote:
> > > Nick already found and solved this problem.
> > > It can help you.
> > >
> > > http://patchwork.kernel.org/patch/860/
> > >
> >
> > Wow, this reads just like the problem we are seeing.  I'll try the
> > patch and let the test run for a few days!
> >
> > We've even seen it come out of the live lock once in a while as well.  I
> > was thinking cache coherency HW issue until this :)
> >
> > I'll send an update after running the test.
>
> Note that after some discussion, the accepted fix looks a bit
> different (and might potentially fix another problem if the compiler
> gets very smart, although gcc doesn't seem to).
>
> Git commit e8c82c2e23e3527e0c9dc195e432c16784d270fa

Yes, we are testing with this one liner fix, 30 hrs and counting.

It's looking pretty good.

thanks!

--mgross
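
The retry path discussed in this thread can be modelled in userspace. What
follows is only a sketch under stated assumptions, not the kernel code:
struct model_page, page_slot, get_unless_zero(), put_page_ref(), lookup_get()
and the max_tries bound are hypothetical names made up for this example, and
C11 atomics stand in for the kernel's own primitives. It is meant to show
the pattern behind "goto repeat": the slot is loaded again on every pass
because another processor may have changed it, and the speculative get
refuses to take a reference while the count is 0. The kernel loop quoted
above has no retry bound; the bound here exists only so that the stuck case
terminates.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only the reference count. */
struct model_page {
	atomic_int count;
};

/* Hypothetical stand-in for a radix tree slot: an atomic page pointer. */
typedef _Atomic(struct model_page *) page_slot;

/*
 * Take a reference only if the count is currently non-zero.
 * Returns 1 on success, 0 if the count was zero.
 */
static inline int get_unless_zero(struct model_page *page)
{
	int old = atomic_load(&page->count);

	while (old != 0) {
		/* On failure, old is refreshed with the current count. */
		if (atomic_compare_exchange_weak(&page->count, &old, old + 1))
			return 1;
	}
	return 0;
}

/* Drop a reference taken by get_unless_zero(). */
static inline void put_page_ref(struct model_page *page)
{
	int old = atomic_fetch_sub(&page->count, 1);

	assert(old > 0);
	(void)old;
}

/*
 * Look up the page in *slot and take a reference on it.
 * Returns the page, or NULL if the slot is empty or if max_tries
 * passes went by without the count or the slot changing.
 */
static inline struct model_page *lookup_get(page_slot *slot,
					    unsigned int max_tries)
{
	unsigned int tries;

	for (tries = 0; tries < max_tries; tries++) {
		/* Fresh load on every pass: another thread may update it. */
		struct model_page *page = atomic_load(slot);

		if (!page)
			return NULL;		/* slot was emptied */
		if (!get_unless_zero(page))
			continue;		/* count is 0: retry */
		if (page != atomic_load(slot)) {
			put_page_ref(page);	/* page moved: drop, retry */
			continue;
		}
		return page;
	}
	return NULL;	/* nothing changed the slot or the count */
}
```

With a single thread and a page whose count stays at 0, lookup_get() uses up
its tries and returns NULL, which corresponds to the unbounded kernel loop
spinning until some other processor changes the slot or the count. With a
count of 1 it returns the page and leaves the count at 2.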