From: Konstantin Khlebnikov <khlebnikov@openvz.org>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Hugh Dickins <hughd@google.com>,
	"hannes@cmpxchg.org" <hannes@cmpxchg.org>
Subject: Re: [PATCH RFC 00/15] mm: memory book keeping and lru_lock splitting
Date: Thu, 16 Feb 2012 09:43:52 +0400
Message-ID: <4F3C9798.7050800@openvz.org>
In-Reply-To: <20120216110408.f35c3448.kamezawa.hiroyu@jp.fujitsu.com>

KAMEZAWA Hiroyuki wrote:
> On Thu, 16 Feb 2012 02:57:04 +0400
> Konstantin Khlebnikov<khlebnikov@openvz.org>  wrote:
>
>> There should be no logic changes in this patchset, this is only tossing bits around.
>> [ This patchset is on top some memcg cleanup/rework patches,
>>    which I sent to linux-mm@ today/yesterday ]
>>
>> Most of things in this patchset are self-descriptive, so here brief plan:
>>
>
> AFAIK, Hugh Dickins said he has per-zone-per-lru-lock and is testing it.
> So, please CC him and Johannes, at least.
>

Ok

>
>> * Transmute struct lruvec into struct book. Like real book this struct will
>>    store set of pages for one zone. It will be working unit for reclaimer code.
>> [ If memcg is disabled in config there will only one book embedded into struct zone ]
>>
>
> Why do you need to add a new structure rather than enhancing lruvec ?
> "book" means a binder of pages ?
>

I responded to this in the reply to Hugh Dickins.

>
>> * move page-lru counters to struct book
>> [ this adds extra overhead in add_page_to_lru_list()/del_page_from_lru_list() for
>>    non-memcg case, but I believe it will be invisible, only one non-atomic add/sub
>>    in the same cacheline with lru list ]
>>
>
> This seems straightforward.
>
>> * unify inactive_list_is_low_global() and cleanup reclaimer code
>> * replace struct mem_cgroup_zone with single pointer to struct book
>
> Hm, ok.
>
>> * optimize page to book translations, move it upper in the call stack,
>>    replace some struct zone arguments with struct book pointer.
>>
>
> a page->book translator from patch 2/15
>
> +struct book *page_book(struct page *page)
> +{
> +	struct mem_cgroup_per_zone *mz;
> +	struct page_cgroup *pc;
> +
> +	if (mem_cgroup_disabled())
> +		return &page_zone(page)->book;
> +
> +	pc = lookup_page_cgroup(page);
> +	if (!PageCgroupUsed(pc))
> +		return &page_zone(page)->book;
> +	/* Ensure pc->mem_cgroup is visible after reading PCG_USED. */
> +	smp_rmb();
> +	mz = mem_cgroup_zoneinfo(pc->mem_cgroup,
> +			page_to_nid(page), page_zonenum(page));
> +	return &mz->book;
> +}
>
> What happens when pc->mem_cgroup is rewritten by move_account() ?
> Where is the guard for lockless access of this ?

Initially this is supposed to be protected by lru_lock; in the final patch it is protected by RCU.
After the final patch, all page_book() calls are collected in the [__re]lock_page_book[_irq]() functions.
They pick a book reference, lock its lru, and recheck the page -> book reference in a loop until it succeeds.
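
To illustrate the pick, lock, recheck loop, here is a minimal userspace sketch. It is not the code from the patchset: struct book and struct page are hypothetical stand-ins, a pthread mutex plays the role of the per-book lru lock, unlock_book() and move_page() are names made up for this example, and books are assumed never to be freed (the patchset relies on RCU for that part).

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel structures from the patchset. */
struct book {
	pthread_mutex_t lru_lock;
};

struct page {
	_Atomic(struct book *) book;	/* may be changed by a concurrent move */
};

/* Stand-in for page_book(): an unlocked snapshot of the page's book. */
static struct book *page_book(struct page *page)
{
	return atomic_load_explicit(&page->book, memory_order_acquire);
}

/*
 * Pick a book, lock it, then recheck that the page still points at it.
 * On a mismatch, drop the lock and retry with the new book.
 */
static struct book *lock_page_book(struct page *page)
{
	struct book *book;

	for (;;) {
		book = page_book(page);
		assert(book != NULL);
		pthread_mutex_lock(&book->lru_lock);
		if (book == page_book(page))
			return book;	/* stable: movers need this lock too */
		pthread_mutex_unlock(&book->lru_lock);
	}
}

static void unlock_book(struct book *book)
{
	pthread_mutex_unlock(&book->lru_lock);
}

/* Assumed mover: changes page->book only under the old book's lock. */
static void move_page(struct page *page, struct book *to)
{
	struct book *from = lock_page_book(page);

	atomic_store_explicit(&page->book, to, memory_order_release);
	unlock_book(from);
}
```

The recheck is only meaningful because the mover takes the old book's lock before rewriting the pointer; without that rule the reference could change again right after the comparison.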

Currently I see only one potential problem: free_mem_cgroup_per_zone_info() in "mm: memory bookkeeping core"
maybe should call spin_unlock_wait(&zone->lru_lock), because someone can pick page_book(pfn_to_page(pfn))
and try to isolate this page. But I am not sure how this is possible. In the final patch it is completely fixed with RCU.

>
> Thanks,
> -Kame
>
