Re: [PATCH 05/15] memcg: fix VM_BUG_ON from page migration

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Balbir Singh <balbir@linux.vnet.ibm.com>
To: Hugh Dickins <hugh@veritas.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Hirokazu Takahashi <taka@valinux.co.jp>,
	YAMAMOTO Takashi <yamamoto@valinux.co.jp>,
	linux-mm@kvack.org
Subject: Re: [PATCH 05/15] memcg: fix VM_BUG_ON from page migration
Date: Wed, 27 Feb 2008 19:13:00 +0530	[thread overview]
Message-ID: <47C568E4.5060504@linux.vnet.ibm.com> (raw)
In-Reply-To: <Pine.LNX.4.64.0802271257540.8683@blonde.site>

Hugh Dickins wrote:
> On Wed, 27 Feb 2008, Balbir Singh wrote:
>> * Hugh Dickins <hugh@veritas.com> [2008-02-25 23:39:23]:
>>
>>> Page migration gave me free_hot_cold_page's VM_BUG_ON page->page_cgroup.
>>> remove_migration_pte was calling mem_cgroup_charge on the new page whenever
>>> it found a swap pte, before it had determined it to be a migration entry.
>>> That left a surplus reference count on the page_cgroup, so it was still
>>> attached when the page was later freed.
>>>
>>> Move that mem_cgroup_charge down to where we're sure it's a migration entry.
>>> We were already under i_mmap_lock or anon_vma->lock, so its GFP_KERNEL was
>>> already inappropriate: change that to GFP_ATOMIC.
>>>
>> One side effect I see of this patch is that the page_cgroup lock and
>> the lru_lock can now be taken from within i_mmap_lock or
>> anon_vma->lock.
> 
> That's not a side-effect of this patch, but it is something which was
> already being done there, and you're absolutely right to draw attention
> to it, thank you.
> 
> Although I mentioned they were held in the comment, it hadn't really
> dawned on me how unwelcome that is: it's not a violation, and lockdep
> doesn't protest, but we'd all be happier not to interweave those
> otherwise independent locks in that one place.
> 
> Oh, hold on, no, it's not that one place.  It's a well-established
> nesting of locks, as when mem_cgroup_uncharge_page is called by
> page_remove_rmap from try_to_unmap_one.  Panic over!  But we'd
> better add memcontrol's locks to the hierarchies shown in
> filemap.c and in rmap.c.
> 

Yes, I agree. I also agree it is not a side-effect and we're already doing that.
But well worth documenting in the places you've mentioned.


>>> -	if (mem_cgroup_charge(new, mm, GFP_KERNEL)) {
>>> -		pte_unmap(ptep);
>>> -		return;
>>> -	}
>>> -
>>>   	ptl = pte_lockptr(mm, pmd);
>>>   	spin_lock(ptl);
>>>  	pte = *ptep;
>>> @@ -169,6 +164,20 @@ static void remove_migration_pte(struct 
>>>  	if (!is_migration_entry(entry) || migration_entry_to_page(entry) != old)
>>>  		goto out;
>> Is it not easier to uncharge here then to move to the charging to the
>> context below? Do you suspect this will be a common operation (so we
>> might end up charging/uncharing more frequently?)
> 
> In what way would it be easier to charge too early, then uncharge
> when we find it was wrong, than simply to charge once we know it's
> right, as the patch does?
> 
> If we were not already in atomic context, it would definitely be
> better to do it the way you suggest, with GFP_KERNEL not GFP_ATOMIC;
> but we're already in atomic context, so I cannot see any advantage
> to doing it your way.
> 
> What would be a real improvement is a way of doing it outside the
> atomic context: I've given that little thought, but it's not obvious
> how.  And really the wider issue of force_empty inconsistencies is
> more important than this singular wart in page migration.

Yes, you are absolutely right, we called the route with the anon_vma lock or the
i_mmap_lock held. So moving it outside does not really help. In fact this patch
fixes the bug where we use GFP_KERNEL from within a lock.

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL

PS: I know I've been slow in reviewing the patches. I am trying to run and
compile each patch as I review it. Please bear with me while I catch up with the
series of patches.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

next prev parent reply	other threads:[~2008-02-27 13:43 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-02-25 23:34 [PATCH 00/15] memcg: fixes and cleanups Hugh Dickins
2008-02-25 23:35 ` [PATCH 01/15] memcg: mm_match_cgroup not vm_match_cgroup Hugh Dickins
2008-02-26  0:39   ` David Rientjes
2008-02-26  3:27     ` Hugh Dickins
2008-02-26  2:41   ` Balbir Singh
2008-02-26 23:46   ` KAMEZAWA Hiroyuki
2008-02-28  3:47   ` Andrew Morton
2008-02-28  7:19     ` David Rientjes
2008-02-28  7:26       ` Andrew Morton
2008-02-28  8:08         ` Hugh Dickins
2008-02-25 23:36 ` [PATCH 02/15] memcg: move_lists on page not page_cgroup Hugh Dickins
2008-02-26 15:52   ` Balbir Singh
2008-02-26 23:45   ` KAMEZAWA Hiroyuki
2008-02-25 23:37 ` [PATCH 03/15] memcg: page_cache_release not __free_page Hugh Dickins
2008-02-26 16:02   ` Balbir Singh
2008-02-26 23:38   ` KAMEZAWA Hiroyuki
2008-02-25 23:38 ` [PATCH 04/15] memcg: when do_swap's do_wp_page fails Hugh Dickins
2008-02-26 23:41   ` KAMEZAWA Hiroyuki
2008-02-27  5:08   ` Balbir Singh
2008-02-27 12:57     ` Hugh Dickins
2008-02-25 23:39 ` [PATCH 05/15] memcg: fix VM_BUG_ON from page migration Hugh Dickins
2008-02-26  1:30   ` KAMEZAWA Hiroyuki
2008-02-27  5:52   ` Balbir Singh
2008-02-27 13:23     ` Hugh Dickins
2008-02-27 13:43       ` Balbir Singh [this message]
2008-02-25 23:40 ` [PATCH 06/15] memcg: bad page if page_cgroup when free Hugh Dickins
2008-02-26 23:44   ` KAMEZAWA Hiroyuki
2008-02-27  8:38   ` Balbir Singh
2008-02-25 23:41 ` [PATCH 07/15] memcg: mem_cgroup_charge never NULL Hugh Dickins
2008-02-26  1:32   ` KAMEZAWA Hiroyuki
2008-02-27  8:42   ` Balbir Singh
2008-02-25 23:42 ` [PATCH 08/15] memcg: remove mem_cgroup_uncharge Hugh Dickins
2008-02-26  1:34   ` KAMEZAWA Hiroyuki
2008-02-28 18:22   ` Balbir Singh
2008-02-25 23:43 ` [PATCH 09/15] memcg: memcontrol whitespace cleanups Hugh Dickins
2008-02-25 23:44 ` [PATCH 10/15] memcg: memcontrol uninlined and static Hugh Dickins
2008-02-26  1:36   ` KAMEZAWA Hiroyuki
2008-02-25 23:46 ` [PATCH 11/15] memcg: remove clear_page_cgroup and atomics Hugh Dickins
2008-02-26  1:38   ` KAMEZAWA Hiroyuki
2008-02-25 23:47 ` [PATCH 12/15] memcg: css_put after remove_list Hugh Dickins
2008-02-26  1:39   ` KAMEZAWA Hiroyuki
2008-02-25 23:49 ` [PATCH 13/15] memcg: fix mem_cgroup_move_lists locking Hugh Dickins
2008-02-26  1:43   ` KAMEZAWA Hiroyuki
2008-02-26  2:56     ` Hugh Dickins
2008-02-25 23:50 ` [PATCH 14/15] memcg: simplify force_empty and move_lists Hugh Dickins, Hirokazu Takahashi
2008-02-26  1:48   ` KAMEZAWA Hiroyuki
2008-02-26  3:23     ` Hugh Dickins
2008-02-26  4:09       ` KAMEZAWA Hiroyuki
2008-02-25 23:51 ` [PATCH 15/15] memcg: fix oops on NULL lru list Hugh Dickins
2008-02-26  1:26 ` [PATCH 00/15] memcg: fixes and cleanups KAMEZAWA Hiroyuki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=47C568E4.5060504@linux.vnet.ibm.com \
    --to=balbir@linux.vnet.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=hugh@veritas.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=linux-mm@kvack.org \
    --cc=taka@valinux.co.jp \
    --cc=yamamoto@valinux.co.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.