From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Andrew Morton <akpm@osdl.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [patch 1/4] mm: page refcount use atomic primitives
Date: Mon, 09 Jan 2006 07:40:53 +1100 [thread overview]
Message-ID: <43C178D5.5010703@yahoo.com.au> (raw)
In-Reply-To: <20060107215413.560aa3a9.akpm@osdl.org>
Andrew Morton wrote:
> Nick Piggin <nickpiggin@yahoo.com.au> wrote:
>
>>The VM has an interesting race where a page refcount can drop to zero, but
>>it is still on the LRU lists for a short time. This was solved by testing
>>a 0->1 refcount transition when picking up pages from the LRU, and dropping
>>the refcount in that case.
>
>
> Tell me about it...
>
>
>>Instead, use atomic_inc_not_zero to ensure we never pick up a 0 refcount
>>page from the LRU (ie. we guarantee the page will not be touched).
>
>
> atomic_inc_not_zero() looks rather bloaty, but a single call site is OK.
>
A little. On generic cmpxchg/cas architectures it isn't too bad, and
LL/SC architectures presently implement it fairly stupidly with cmpxchg
but they can do a much better job using ll/sc directly.
>
>>This ensures we can test PageLRU without taking the lru_lock,
>
>
> Let me write some changelog for you.
>
> isolate_lru_pages() can remove live pages from the LRU at any time and
> shrink_cache() can put them back at any time. As we don't hold the
> zone->lock we can race against that.
>
>
>>void fastcall __page_cache_release(struct page *page)
>>{
>> if (PageLRU(page)) {
>> unsigned long flags;
>
>
> isolate_lru_pages() removes the page here.
>
>
>> struct zone *zone = page_zone(page);
>> spin_lock_irqsave(&zone->lru_lock, flags);
>> if (!TestClearPageLRU(page))
>> BUG();
>
>
> blam.
>
>
>> del_page_from_lru(zone, page);
>> spin_unlock_irqrestore(&zone->lru_lock, flags);
>> }
>>
>> BUG_ON(page_count(page) != 0);
>> free_hot_page(page);
>>}
>>
>
>
> But put_page() wouldn't have entered __page_cache_release() at all, because
> isolate_lru_page() is changed by this patch to elevated the page refcount
> prior to clearing PG_lru:
>
> BUG_ON(!PageLRU(page));
> list_del(&page->lru);
> target = src;
> if (get_page_unless_zero(page)) {
> ClearPageLRU(page);
>
>
> So no blam.
>
> That's from a two-minute-peek. I haven't thought about this dreadfully
> hard. But I'd like to gain some confidence that you have, please. This
> stuff is tricky.
>
Right, no blam. This is how I avoid taking the LRU lock.
"Instead, use atomic_inc_not_zero to ensure we never **pick up a 0 refcount**
page from the LRU (ie. we guarantee the page will not be touched)."
I don't understand what you're asking?
>
>>and allows
>>further optimisations (in later patches) -- we end up saving 2 atomic ops
>>including a spin_lock_irqsave in the !PageLRU case, and 2 or 3 atomic ops
>>in the PageLRU case.
>
>
> Well yeah, but you've pretty much eliminated all those nice speedups by
> adding several BUG_ON(atomic_op)s. Everyone compiles with CONFIG_BUG. So
> I'd suggest that such new assertions be broken out into a separate -mm-only
> patch.
>
Not quite eliminated because ClearPageXXX is an atomic RMW, __ClearPageXXX
is not. Also TestClearXXX includes memory barriers.
Anyway I wanted to keep it equivalent (ie. keep bugs there). I have a future
patch to move these assertions under CONFIG_DEBUG_VM.
--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com
WARNING: multiple messages have this Message-ID (diff)
From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Andrew Morton <akpm@osdl.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [patch 1/4] mm: page refcount use atomic primitives
Date: Mon, 09 Jan 2006 07:40:53 +1100 [thread overview]
Message-ID: <43C178D5.5010703@yahoo.com.au> (raw)
In-Reply-To: <20060107215413.560aa3a9.akpm@osdl.org>
Andrew Morton wrote:
> Nick Piggin <nickpiggin@yahoo.com.au> wrote:
>
>>The VM has an interesting race where a page refcount can drop to zero, but
>>it is still on the LRU lists for a short time. This was solved by testing
>>a 0->1 refcount transition when picking up pages from the LRU, and dropping
>>the refcount in that case.
>
>
> Tell me about it...
>
>
>>Instead, use atomic_inc_not_zero to ensure we never pick up a 0 refcount
>>page from the LRU (ie. we guarantee the page will not be touched).
>
>
> atomic_inc_not_zero() looks rather bloaty, but a single call site is OK.
>
A little. On generic cmpxchg/cas architectures it isn't too bad, and
LL/SC architectures presently implement it fairly stupidly with cmpxchg
but they can do a much better job using ll/sc directly.
>
>>This ensures we can test PageLRU without taking the lru_lock,
>
>
> Let me write some changelog for you.
>
> isolate_lru_pages() can remove live pages from the LRU at any time and
> shrink_cache() can put them back at any time. As we don't hold the
> zone->lock we can race against that.
>
>
>>void fastcall __page_cache_release(struct page *page)
>>{
>> if (PageLRU(page)) {
>> unsigned long flags;
>
>
> isolate_lru_pages() removes the page here.
>
>
>> struct zone *zone = page_zone(page);
>> spin_lock_irqsave(&zone->lru_lock, flags);
>> if (!TestClearPageLRU(page))
>> BUG();
>
>
> blam.
>
>
>> del_page_from_lru(zone, page);
>> spin_unlock_irqrestore(&zone->lru_lock, flags);
>> }
>>
>> BUG_ON(page_count(page) != 0);
>> free_hot_page(page);
>>}
>>
>
>
> But put_page() wouldn't have entered __page_cache_release() at all, because
> isolate_lru_page() is changed by this patch to elevated the page refcount
> prior to clearing PG_lru:
>
> BUG_ON(!PageLRU(page));
> list_del(&page->lru);
> target = src;
> if (get_page_unless_zero(page)) {
> ClearPageLRU(page);
>
>
> So no blam.
>
> That's from a two-minute-peek. I haven't thought about this dreadfully
> hard. But I'd like to gain some confidence that you have, please. This
> stuff is tricky.
>
Right, no blam. This is how I avoid taking the LRU lock.
"Instead, use atomic_inc_not_zero to ensure we never **pick up a 0 refcount**
page from the LRU (ie. we guarantee the page will not be touched)."
I don't understand what you're asking?
>
>>and allows
>>further optimisations (in later patches) -- we end up saving 2 atomic ops
>>including a spin_lock_irqsave in the !PageLRU case, and 2 or 3 atomic ops
>>in the PageLRU case.
>
>
> Well yeah, but you've pretty much eliminated all those nice speedups by
> adding several BUG_ON(atomic_op)s. Everyone compiles with CONFIG_BUG. So
> I'd suggest that such new assertions be broken out into a separate -mm-only
> patch.
>
Not quite eliminated because ClearPageXXX is an atomic RMW, __ClearPageXXX
is not. Also TestClearXXX includes memory barriers.
Anyway I wanted to keep it equivalent (ie. keep bugs there). I have a future
patch to move these assertions under CONFIG_DEBUG_VM.
--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2006-01-08 20:41 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-01-08 5:19 [patch 0/4] mm: de-skew page_count Nick Piggin
2006-01-08 5:23 ` Nick Piggin
2006-01-08 5:20 ` [patch 1/4] mm: page refcount use atomic primitives Nick Piggin
2006-01-08 5:23 ` Nick Piggin
2006-01-08 5:54 ` Andrew Morton
2006-01-08 5:54 ` Andrew Morton
2006-01-08 20:40 ` Nick Piggin [this message]
2006-01-08 20:40 ` Nick Piggin
2006-01-08 21:43 ` Andrew Morton
2006-01-08 21:43 ` Andrew Morton
2006-01-08 5:20 ` [patch 2/4] mm: PageLRU no testset Nick Piggin
2006-01-08 5:24 ` Nick Piggin
2006-01-08 5:21 ` [patch 3/4] mm: PageActive " Nick Piggin
2006-01-08 5:24 ` Nick Piggin
2006-01-08 5:21 ` [patch 4/4] mm: less atomic ops Nick Piggin
2006-01-08 5:25 ` Nick Piggin
2006-01-08 5:42 ` [patch 0/4] mm: de-skew page_count Nick Piggin
2006-01-08 5:42 ` Nick Piggin
-- strict thread matches above, loose matches on Subject: below --
2006-01-18 10:40 [patch 0/4] mm: de-skew page refcount Nick Piggin
2006-01-18 10:40 ` [patch 1/4] mm: page refcount use atomic primitives Nick Piggin
2006-01-18 10:40 ` Nick Piggin
2005-12-05 11:59 [patch 0/4] mm: optimisations Nick Piggin
2005-12-05 20:43 ` [patch 1/4] mm: page refcount use atomic primitives Nick Piggin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=43C178D5.5010703@yahoo.com.au \
--to=nickpiggin@yahoo.com.au \
--cc=akpm@osdl.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.