Re: [PATCH 6/10] mm/memcg: take care over pc->mem_cgroup

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Konstantin Khlebnikov <khlebnikov@openvz.org>
To: Hugh Dickins <hughd@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Ying Han <yinghan@google.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 6/10] mm/memcg: take care over pc->mem_cgroup
Date: Wed, 22 Feb 2012 08:05:49 +0400	[thread overview]
Message-ID: <4F44699D.3020309@openvz.org> (raw)
In-Reply-To: <alpine.LSU.2.00.1202211437140.2012@eggly.anvils>

Hugh Dickins wrote:
> On Tue, 21 Feb 2012, KAMEZAWA Hiroyuki wrote:
>> On Mon, 20 Feb 2012 15:34:28 -0800 (PST)
>> Hugh Dickins<hughd@google.com>  wrote:
>> 	return NULL;
>>>
>>> +	lruvec = page_lock_lruvec(page);
>>>   	lock_page_cgroup(pc);
>>>
>>
>> Do we need to take lrulock+irq disable per page in this very very hot path ?
>
> I'm sure we don't want to: I hope you were pleased to find it goes away
> (from most cases) a couple of patches later.
>
> I had lruvec lock nested inside page_cgroup lock in the rollup I sent in
> December, whereas you went for page_cgroup lock nested inside lruvec lock
> in your lrucare patch.
>
> I couldn't find an imperative reason why they should be one way round or
> the other, so I tried hard to stick with your ordering, and it did work
> (in this 6/10).  But then I couldn't work out how to get rid of the
> overheads added in doing it this way round, so swapped them back.
>
>>
>> Hmm.... How about adding NR_ISOLATED counter into lruvec ?
>>
>> Then, we can delay freeing lruvec until all conunters goes down to zero.
>> as...
>>
>> 	bool we_can_free_lruvec = true;
>>
>> 	lock_lruvec(lruvec->lock);
>> 	for_each_lru_lruvec(lru)
>> 		if (!list_empty(&lruvec->lru[lru]))
>> 			we_can_free_lruvec = false;
>> 	if (lruvec->nr_isolated)
>> 		we_can_free_lruvec = false;
>> 	unlock_lruvec(lruvec)
>> 	if (we_can_free_lruvec)
>> 		kfree(lruvec);
>>
>> If compaction, lumpy reclaim free a page taken from LRU,
>> it knows what it does and can decrement lruvec->nr_isolated properly
>> (it seems zone's NR_ISOLATED is decremented at putback.)
>
> At the moment I'm thinking that what we end up with by 9/10 is
> better than adding such a refcount.  But I'm not entirely happy with
> mem_cgroup_reset_uncharged_to_root (it adds a further page_cgroup
> lookup just after I got rid of some others), and need yet to think
> about the race which Konstantin posits, so all options remain open.

This lruvec->nr_isolated seem reasonable, and its managegin not very costly.
In move_account() we anyway need to touch old_lruvec->lru_lock after recharge,
to stabilize PageLRU() before adding page to new_lruvec. (because that race)
In migration/compaction this handled automatically, because they always call putback_lru_page() at the end.
Main problem is shrink_page_list() for lumpy-reclaim, but seems like it never used if memory
compaction is enabled, so it can be slow and ineffective with tons of lru_list relocks.

>
> Hugh

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)

From: Konstantin Khlebnikov <khlebnikov@openvz.org>
To: Hugh Dickins <hughd@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Ying Han <yinghan@google.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 6/10] mm/memcg: take care over pc->mem_cgroup
Date: Wed, 22 Feb 2012 08:05:49 +0400	[thread overview]
Message-ID: <4F44699D.3020309@openvz.org> (raw)
In-Reply-To: <alpine.LSU.2.00.1202211437140.2012@eggly.anvils>

Hugh Dickins wrote:
> On Tue, 21 Feb 2012, KAMEZAWA Hiroyuki wrote:
>> On Mon, 20 Feb 2012 15:34:28 -0800 (PST)
>> Hugh Dickins<hughd@google.com>  wrote:
>> 	return NULL;
>>>
>>> +	lruvec = page_lock_lruvec(page);
>>>   	lock_page_cgroup(pc);
>>>
>>
>> Do we need to take lrulock+irq disable per page in this very very hot path ?
>
> I'm sure we don't want to: I hope you were pleased to find it goes away
> (from most cases) a couple of patches later.
>
> I had lruvec lock nested inside page_cgroup lock in the rollup I sent in
> December, whereas you went for page_cgroup lock nested inside lruvec lock
> in your lrucare patch.
>
> I couldn't find an imperative reason why they should be one way round or
> the other, so I tried hard to stick with your ordering, and it did work
> (in this 6/10).  But then I couldn't work out how to get rid of the
> overheads added in doing it this way round, so swapped them back.
>
>>
>> Hmm.... How about adding NR_ISOLATED counter into lruvec ?
>>
>> Then, we can delay freeing lruvec until all conunters goes down to zero.
>> as...
>>
>> 	bool we_can_free_lruvec = true;
>>
>> 	lock_lruvec(lruvec->lock);
>> 	for_each_lru_lruvec(lru)
>> 		if (!list_empty(&lruvec->lru[lru]))
>> 			we_can_free_lruvec = false;
>> 	if (lruvec->nr_isolated)
>> 		we_can_free_lruvec = false;
>> 	unlock_lruvec(lruvec)
>> 	if (we_can_free_lruvec)
>> 		kfree(lruvec);
>>
>> If compaction, lumpy reclaim free a page taken from LRU,
>> it knows what it does and can decrement lruvec->nr_isolated properly
>> (it seems zone's NR_ISOLATED is decremented at putback.)
>
> At the moment I'm thinking that what we end up with by 9/10 is
> better than adding such a refcount.  But I'm not entirely happy with
> mem_cgroup_reset_uncharged_to_root (it adds a further page_cgroup
> lookup just after I got rid of some others), and need yet to think
> about the race which Konstantin posits, so all options remain open.

This lruvec->nr_isolated seem reasonable, and its managegin not very costly.
In move_account() we anyway need to touch old_lruvec->lru_lock after recharge,
to stabilize PageLRU() before adding page to new_lruvec. (because that race)
In migration/compaction this handled automatically, because they always call putback_lru_page() at the end.
Main problem is shrink_page_list() for lumpy-reclaim, but seems like it never used if memory
compaction is enabled, so it can be slow and ineffective with tons of lru_list relocks.

>
> Hugh

next prev parent reply	other threads:[~2012-02-22  4:05 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-02-20 23:26 [PATCH 0/10] mm/memcg: per-memcg per-zone lru locking Hugh Dickins
2012-02-20 23:26 ` Hugh Dickins
2012-02-20 23:28 ` [PATCH 1/10] mm/memcg: scanning_global_lru means mem_cgroup_disabled Hugh Dickins
2012-02-20 23:28   ` Hugh Dickins
2012-02-21  8:03   ` KAMEZAWA Hiroyuki
2012-02-21  8:03     ` KAMEZAWA Hiroyuki
2012-02-20 23:29 ` [PATCH 2/10] mm/memcg: move reclaim_stat into lruvec Hugh Dickins
2012-02-20 23:29   ` Hugh Dickins
2012-02-21  8:05   ` KAMEZAWA Hiroyuki
2012-02-21  8:05     ` KAMEZAWA Hiroyuki
2012-02-20 23:30 ` [PATCH 3/10] mm/memcg: add zone pointer " Hugh Dickins
2012-02-20 23:30   ` Hugh Dickins
2012-02-21  8:08   ` KAMEZAWA Hiroyuki
2012-02-21  8:08     ` KAMEZAWA Hiroyuki
2012-02-20 23:32 ` [PATCH 4/10] mm/memcg: apply add/del_page to lruvec Hugh Dickins
2012-02-20 23:32   ` Hugh Dickins
2012-02-21  8:20   ` KAMEZAWA Hiroyuki
2012-02-21  8:20     ` KAMEZAWA Hiroyuki
2012-02-21 22:25     ` Hugh Dickins
2012-02-21 22:25       ` Hugh Dickins
2012-02-20 23:33 ` [PATCH 5/10] mm/memcg: introduce page_relock_lruvec Hugh Dickins
2012-02-20 23:33   ` Hugh Dickins
2012-02-21  8:38   ` KAMEZAWA Hiroyuki
2012-02-21  8:38     ` KAMEZAWA Hiroyuki
2012-02-21 22:36     ` Hugh Dickins
2012-02-21 22:36       ` Hugh Dickins
2012-02-20 23:34 ` [PATCH 6/10] mm/memcg: take care over pc->mem_cgroup Hugh Dickins
2012-02-20 23:34   ` Hugh Dickins
2012-02-21  5:55   ` Konstantin Khlebnikov
2012-02-21  5:55     ` Konstantin Khlebnikov
2012-02-21 19:37     ` Hugh Dickins
2012-02-21 19:37       ` Hugh Dickins
2012-02-21 20:40       ` Konstantin Khlebnikov
2012-02-21 20:40         ` Konstantin Khlebnikov
2012-02-21 22:05         ` Hugh Dickins
2012-02-21 22:05           ` Hugh Dickins
2012-02-21  6:05   ` Konstantin Khlebnikov
2012-02-21  6:05     ` Konstantin Khlebnikov
2012-02-21 20:00     ` Hugh Dickins
2012-02-21 20:00       ` Hugh Dickins
2012-02-21  9:13   ` KAMEZAWA Hiroyuki
2012-02-21  9:13     ` KAMEZAWA Hiroyuki
2012-02-21 23:03     ` Hugh Dickins
2012-02-21 23:03       ` Hugh Dickins
2012-02-22  4:05       ` Konstantin Khlebnikov [this message]
2012-02-22  4:05         ` Konstantin Khlebnikov
2012-02-20 23:35 ` [PATCH 7/10] mm/memcg: remove mem_cgroup_reset_owner Hugh Dickins
2012-02-20 23:35   ` Hugh Dickins
2012-02-21  9:17   ` KAMEZAWA Hiroyuki
2012-02-21  9:17     ` KAMEZAWA Hiroyuki
2012-02-20 23:36 ` [PATCH 8/10] mm/memcg: nest lru_lock inside page_cgroup lock Hugh Dickins
2012-02-20 23:36   ` Hugh Dickins
2012-02-21  9:48   ` KAMEZAWA Hiroyuki
2012-02-21  9:48     ` KAMEZAWA Hiroyuki
2012-02-20 23:38 ` [PATCH 9/10] mm/memcg: move lru_lock into lruvec Hugh Dickins
2012-02-20 23:38   ` Hugh Dickins
2012-02-21  7:08   ` Konstantin Khlebnikov
2012-02-21  7:08     ` Konstantin Khlebnikov
2012-02-21 20:12     ` Hugh Dickins
2012-02-21 20:12       ` Hugh Dickins
2012-02-21 21:35       ` Konstantin Khlebnikov
2012-02-21 21:35         ` Konstantin Khlebnikov
2012-02-21 22:12         ` Hugh Dickins
2012-02-21 22:12           ` Hugh Dickins
2012-02-22  3:43           ` Konstantin Khlebnikov
2012-02-22  3:43             ` Konstantin Khlebnikov
2012-02-22  6:09             ` Hugh Dickins
2012-02-22  6:09               ` Hugh Dickins
2012-02-23 14:21               ` Konstantin Khlebnikov
2012-02-23 14:21                 ` Konstantin Khlebnikov
2012-02-20 23:39 ` [PATCH 10/10] mm/memcg: per-memcg per-zone lru locking Hugh Dickins
2012-02-20 23:39   ` Hugh Dickins

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F44699D.3020309@openvz.org \
    --to=khlebnikov@openvz.org \
    --cc=akpm@linux-foundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=yinghan@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.