Re: Race condition between putback_lru_page and mem_cgroup_move_list

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Balbir Singh <balbir@linux.vnet.ibm.com>
To: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	MinChan Kim <minchan.kim@gmail.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	linux-mm <linux-mm@kvack.org>, Rik van Riel <riel@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: Race condition between putback_lru_page and mem_cgroup_move_list
Date: Thu, 07 Aug 2008 18:12:14 +0530	[thread overview]
Message-ID: <489AEDA6.5040802@linux.vnet.ibm.com> (raw)
In-Reply-To: <1218041585.6173.45.camel@lts-notebook>

Lee Schermerhorn wrote:
> On Mon, 2008-08-04 at 23:22 +0530, Balbir Singh wrote:
>> KOSAKI Motohiro wrote:
>>> Hi
>>>
>>>>> I think this is a race condition if mem_cgroup_move_lists's comment isn't right.
>>>>> I am not sure that it was already known problem.
>>>>>
>>>>> mem_cgroup_move_lists assume the appropriate zone's lru lock is already held.
>>>>> but putback_lru_page calls mem_cgroup_move_lists without holding lru_lock.
>>>> Hmmm, the comment on mem_cgroup_move_lists() does say this.  Although,
>>>> reading thru' the code, I can't see why it requires this.  But then it's
>>>> Monday, here...
>>> I also think zone's lru lock is unnecessary.
>>> So, I guess below "it" indicate lock_page_cgroup, not zone lru lock.
>>>
>> We need zone LRU lock, since the reclaim paths hold them. Not sure if I
>> understand why you call zone's LRU lock unnecessary, could you elaborate please?
> 
> Hi, Balbir:
> 
> Sorry for the delay in responding.  Distracted...
> 

No problem at all.

> I think that perhaps the zone's LRU lock is unnecessary because I didn't
> see anything in mem_cgroup_move_lists() or it's callees that needed
> protection by the zone lru_lock.  
> 
> Looking at the call sites in the reclaim paths [in
> shrink_[in]active_page()] and activate_page(), they are holding the zone
> lru_lock because they are manipulating the lru lists and/or zone
> statistics. 

Precisely, my point below about updating statistics for zones and you mention
below that the zone LRU excludes the race I mentioned in (1). I am a bit
confused with that statement, do you agree that zone lru_lock excludes the race
and is therefore required?

 The places where pages are moved to a new lru list is where
> you want to insert calls to mem_cgroup_move_lists(), so I think they
> just happen to fall under the zone lru lock.  
> 
> Now, in a subsequent message in this thread, you ask:
> 
> "1. What happens if a global reclaim is in progress at the same time as
> memory cgroup reclaim and they are both looking at the same page?"
> 
> This should not happen, I think.  Racing global and memory cgroup calls
> to __isolate_lru_page() are mutually excluded by the zone lru_lock taken
> in shrink_[in]active_page().

Yes, I was referring to needing the zone lru_lock

  In putback_lru_page(), where we call
> mem_cgroup_move_lists() without holding the zone lru_lock, we've either
> queued up the page for adding to one of the [in]active lists via the
> pagevecs, or we've already moved it to the unevictable list.  If
> mem_cgroup_isolate_pages() finds a page on one of the mz lists before it
> has been drained to the LRU, it will [rightly] skip the page because
> it's "!PageLRU(page)".
> 
> 
> In same message, you state:
> 
> "2. In the shared reclaim infrastructure, we move pages and update
> statistics for pages belonging to a particular zone in a particular
> cgroup."
> 
> Sorry, I don't understand your point.  Are you concerned that the stats
> can get out of sync?  I suppose that, in general, if we called
> mem_cgroup_move_lists() from just anywhere without zone lru_lock
> protection, we could have problems.  In the case of putback_lru_page(),
> again, we've already put the page back on the global unevictable list
> and updated the global stats, or it's on it's way to an [in]active list
> via the pagevecs.  The stats will be updated when the pagevecs are
> drained.
> 
> I think we're OK without explicit zone lru locking around the call to
> mem_cgroup_move_lists() and the global lru list additions in
> putback_lru_page().   
> 

I think I understand what you are stating clearly

We don't need the zone lru_lock in putback_lru_page(). Am I missing something or
do I have it right? (It's Thursday and one of my legs is already in the weekend).


-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL

WARNING: multiple messages have this Message-ID (diff)

From: Balbir Singh <balbir@linux.vnet.ibm.com>
To: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	MinChan Kim <minchan.kim@gmail.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	linux-mm <linux-mm@kvack.org>, Rik van Riel <riel@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: Race condition between putback_lru_page and mem_cgroup_move_list
Date: Thu, 07 Aug 2008 18:12:14 +0530	[thread overview]
Message-ID: <489AEDA6.5040802@linux.vnet.ibm.com> (raw)
In-Reply-To: <1218041585.6173.45.camel@lts-notebook>

Lee Schermerhorn wrote:
> On Mon, 2008-08-04 at 23:22 +0530, Balbir Singh wrote:
>> KOSAKI Motohiro wrote:
>>> Hi
>>>
>>>>> I think this is a race condition if mem_cgroup_move_lists's comment isn't right.
>>>>> I am not sure that it was already known problem.
>>>>>
>>>>> mem_cgroup_move_lists assume the appropriate zone's lru lock is already held.
>>>>> but putback_lru_page calls mem_cgroup_move_lists without holding lru_lock.
>>>> Hmmm, the comment on mem_cgroup_move_lists() does say this.  Although,
>>>> reading thru' the code, I can't see why it requires this.  But then it's
>>>> Monday, here...
>>> I also think zone's lru lock is unnecessary.
>>> So, I guess below "it" indicate lock_page_cgroup, not zone lru lock.
>>>
>> We need zone LRU lock, since the reclaim paths hold them. Not sure if I
>> understand why you call zone's LRU lock unnecessary, could you elaborate please?
> 
> Hi, Balbir:
> 
> Sorry for the delay in responding.  Distracted...
> 

No problem at all.

> I think that perhaps the zone's LRU lock is unnecessary because I didn't
> see anything in mem_cgroup_move_lists() or it's callees that needed
> protection by the zone lru_lock.  
> 
> Looking at the call sites in the reclaim paths [in
> shrink_[in]active_page()] and activate_page(), they are holding the zone
> lru_lock because they are manipulating the lru lists and/or zone
> statistics. 

Precisely, my point below about updating statistics for zones and you mention
below that the zone LRU excludes the race I mentioned in (1). I am a bit
confused with that statement, do you agree that zone lru_lock excludes the race
and is therefore required?

 The places where pages are moved to a new lru list is where
> you want to insert calls to mem_cgroup_move_lists(), so I think they
> just happen to fall under the zone lru lock.  
> 
> Now, in a subsequent message in this thread, you ask:
> 
> "1. What happens if a global reclaim is in progress at the same time as
> memory cgroup reclaim and they are both looking at the same page?"
> 
> This should not happen, I think.  Racing global and memory cgroup calls
> to __isolate_lru_page() are mutually excluded by the zone lru_lock taken
> in shrink_[in]active_page().

Yes, I was referring to needing the zone lru_lock

  In putback_lru_page(), where we call
> mem_cgroup_move_lists() without holding the zone lru_lock, we've either
> queued up the page for adding to one of the [in]active lists via the
> pagevecs, or we've already moved it to the unevictable list.  If
> mem_cgroup_isolate_pages() finds a page on one of the mz lists before it
> has been drained to the LRU, it will [rightly] skip the page because
> it's "!PageLRU(page)".
> 
> 
> In same message, you state:
> 
> "2. In the shared reclaim infrastructure, we move pages and update
> statistics for pages belonging to a particular zone in a particular
> cgroup."
> 
> Sorry, I don't understand your point.  Are you concerned that the stats
> can get out of sync?  I suppose that, in general, if we called
> mem_cgroup_move_lists() from just anywhere without zone lru_lock
> protection, we could have problems.  In the case of putback_lru_page(),
> again, we've already put the page back on the global unevictable list
> and updated the global stats, or it's on it's way to an [in]active list
> via the pagevecs.  The stats will be updated when the pagevecs are
> drained.
> 
> I think we're OK without explicit zone lru locking around the call to
> mem_cgroup_move_lists() and the global lru list additions in
> putback_lru_page().   
> 

I think I understand what you are stating clearly

We don't need the zone lru_lock in putback_lru_page(). Am I missing something or
do I have it right? (It's Thursday and one of my legs is already in the weekend).


-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

next prev parent reply	other threads:[~2008-08-07 12:42 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-08-04 14:36 Race condition between putback_lru_page and mem_cgroup_move_list MinChan Kim
2008-08-04 14:36 ` MinChan Kim
2008-08-04 15:31 ` Lee Schermerhorn
2008-08-04 15:31   ` Lee Schermerhorn
2008-08-04 16:37   ` KOSAKI Motohiro
2008-08-04 16:37     ` KOSAKI Motohiro
2008-08-04 17:52     ` Balbir Singh
2008-08-04 17:52       ` Balbir Singh
2008-08-04 23:52       ` MinChan Kim
2008-08-04 23:52         ` MinChan Kim
2008-08-05  3:49       ` kamezawa.hiroyu
2008-08-05  3:49         ` kamezawa.hiroyu
2008-08-05  6:20       ` KOSAKI Motohiro
2008-08-05  6:20         ` KOSAKI Motohiro
2008-08-05 10:46         ` Balbir Singh
2008-08-05 10:46           ` Balbir Singh
2008-08-05 11:19           ` KOSAKI Motohiro
2008-08-05 11:19             ` KOSAKI Motohiro
2008-08-05 11:38             ` KOSAKI Motohiro
2008-08-05 11:38               ` KOSAKI Motohiro
2008-08-06 16:53       ` Lee Schermerhorn
2008-08-06 16:53         ` Lee Schermerhorn
2008-08-07 11:00         ` KOSAKI Motohiro
2008-08-07 11:00           ` KOSAKI Motohiro
2008-08-07 11:27           ` Lee Schermerhorn
2008-08-07 11:27             ` Lee Schermerhorn
2008-08-07 12:42         ` Balbir Singh [this message]
2008-08-07 12:42           ` Balbir Singh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=489AEDA6.5040802@linux.vnet.ibm.com \
    --to=balbir@linux.vnet.ibm.com \
    --cc=Lee.Schermerhorn@hp.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=kosaki.motohiro@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=minchan.kim@gmail.com \
    --cc=riel@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.