linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Balbir Singh <balbir@linux.vnet.ibm.com>
To: Hugh Dickins <hugh@veritas.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"yamamoto@valinux.co.jp" <yamamoto@valinux.co.jp>,
	"riel@redhat.com" <riel@redhat.com>
Subject: Re: [RFC][PATCH] Clarify mem_cgroup lock handling and avoid races.
Date: Wed, 20 Feb 2008 11:27:50 +0530	[thread overview]
Message-ID: <47BBC15E.5070405@linux.vnet.ibm.com> (raw)
In-Reply-To: <Pine.LNX.4.64.0802191449490.6254@blonde.site>

Hugh Dickins wrote:
> On Tue, 19 Feb 2008, KAMEZAWA Hiroyuki wrote:
>> I'd like to start from RFC.
>>
>> In following code
>> ==
>>   lock_page_cgroup(page);
>>   pc = page_get_page_cgroup(page);
>>   unlock_page_cgroup(page);
>>
>>   access 'pc' later..
>> == (See, page_cgroup_move_lists())
>>
>> There is a race because 'pc' is not a stable value without lock_page_cgroup().
>> (mem_cgroup_uncharge can free this 'pc').
>>
>> For example, page_cgroup_move_lists() access pc without lock.
>> There is a small race window, between page_cgroup_move_lists()
>> and mem_cgroup_uncharge(). At uncharge, page_cgroup struct is immedieately
>> freed but move_list can access it after taking lru_lock.
>> (*) mem_cgroup_uncharge_page() can be called without zone->lru lock.
>>
>> This is not good manner.
>> .....
>> There is no quick fix (maybe). Moreover, I hear some people around me said
>> current memcontrol.c codes are very complicated.
>> I agree ;( ..it's caued by my work.
>>
>> I'd like to fix problems in clean way.
>> (Note: current -rc2 codes works well under heavy pressure. but there
>>  is possibility of race, I think.)
> 
> Yes, yes, indeed, I've been working away on this too.
> 
> Ever since the VM_BUG_ON(page_get_page_cgroup(page)) went into
> free_hot_cold_page (at my own prompting), I've been hitting it
> just very occasionally in my kernel build testing.  Was unable
> to reproduce it over the New Year, but a week or two ago found
> one machine and config on which it is relatively reproducible,
> pretty sure to happen within 12 hours.
> 
> And on Saturday evening at last identified the cause, exactly
> where you have: that unsafety in mem_cgroup_move_lists - which
> has the nice property of putting pages from the lru on to SLUB's
> freelist!
> 
> Unlike the unsafeties of force_empty, this is liable to hit anyone
> running with MEM_CONT compiled in, they don't have to be consciously
> using mem_cgroups at all.
> 
> (I consider that, by the way, quite a serious defect in the current
> mem_cgroup work: that a distro compiling it in for 1% of customers
> is then subjecting all to the mem_cgroup overhead - effectively
> doubled struct page size and unnecessary accounting overhead.  I
> believe there needs to be a way to opt out, a force_empty which
> sticks.  Yes, I know the page_cgroup which does that doubling of
> size is only allocated on demand, but every page cache page and
> every anonymous page is going to have one.  A kmem_cache for them
> will reduce the extra, but there still needs to be a way to opt
> out completely.)
> 

I've been thinking along these lines as well

1. Have a boot option to turn on/off the memory controller
2. Have a separate cache for the page_cgroup structure. I sent this suggestion
   out just yesterday or so.

I agree that these are necessary enhancements/changes.

[snip]



-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2008-02-20  6:02 UTC|newest]

Thread overview: 63+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-02-19 12:54 [RFC][PATCH] Clarify mem_cgroup lock handling and avoid races KAMEZAWA Hiroyuki
2008-02-19 15:40 ` Hugh Dickins
2008-02-20  1:03   ` KAMEZAWA Hiroyuki
2008-02-20  4:14     ` Hugh Dickins
2008-02-20  4:37       ` KAMEZAWA Hiroyuki
2008-02-20  4:39         ` Hugh Dickins
2008-02-20  4:41           ` Balbir Singh
2008-02-20  6:40         ` Balbir Singh
2008-02-20  7:23           ` KAMEZAWA Hiroyuki
2008-02-20  3:14   ` KAMEZAWA Hiroyuki
2008-02-20  3:37     ` YAMAMOTO Takashi
2008-02-20  4:13       ` KAMEZAWA Hiroyuki
2008-02-20  4:32     ` Hugh Dickins
2008-02-20  5:57   ` Balbir Singh [this message]
2008-02-20  9:58     ` Hirokazu Takahashi
2008-02-20 10:06       ` Paul Menage
2008-02-20 10:11         ` Balbir Singh
2008-02-20 10:18           ` Paul Menage
2008-02-20 10:55             ` Balbir Singh
2008-02-20 11:21               ` KAMEZAWA Hiroyuki
2008-02-20 11:18                 ` Balbir Singh
2008-02-20 11:32                   ` KAMEZAWA Hiroyuki
2008-02-20 11:34                     ` Balbir Singh
2008-02-20 11:44                       ` Paul Menage
2008-02-20 11:41                   ` Paul Menage
2008-02-20 11:36       ` Balbir Singh
2008-02-20 11:55         ` Paul Menage
2008-02-21  2:49         ` Hirokazu Takahashi
2008-02-21  6:35           ` KAMEZAWA Hiroyuki
2008-02-21  9:07             ` Hirokazu Takahashi
2008-02-21  9:21               ` KAMEZAWA Hiroyuki
2008-02-21  9:28                 ` Balbir Singh
2008-02-21  9:44                   ` KAMEZAWA Hiroyuki
2008-02-22  3:31                     ` [RFC] Block I/O Cgroup Hirokazu Takahashi
2008-02-22  5:05                       ` KAMEZAWA Hiroyuki
2008-02-22  5:45                         ` Hirokazu Takahashi
2008-02-21  9:25               ` [RFC][PATCH] Clarify mem_cgroup lock handling and avoid races Balbir Singh
2008-02-20  6:27   ` Hirokazu Takahashi
2008-02-20  6:50     ` KAMEZAWA Hiroyuki
2008-02-20  8:32       ` Clean up force_empty (Was Re: [RFC][PATCH] Clarify mem_cgroup lock handling and avoid races.) KAMEZAWA Hiroyuki
2008-02-20 10:07         ` Clean up force_empty Hirokazu Takahashi
2008-02-22  9:24       ` [RFC][PATCH] Clarify mem_cgroup lock handling and avoid races Hugh Dickins
2008-02-22 10:07         ` KAMEZAWA Hiroyuki
2008-02-22 10:25           ` Hugh Dickins
2008-02-22 10:34             ` KAMEZAWA Hiroyuki
2008-02-22 10:50         ` Hirokazu Takahashi
2008-02-22 11:14         ` Balbir Singh
2008-02-22 12:00           ` Hugh Dickins
2008-02-22 12:28             ` Balbir Singh
2008-02-22 12:53               ` Hugh Dickins
2008-02-25  3:18                 ` Balbir Singh
2008-02-19 15:54 ` kamezawa.hiroyu
2008-02-19 16:26   ` Hugh Dickins
2008-02-20  1:55     ` KAMEZAWA Hiroyuki
2008-02-20  2:05       ` YAMAMOTO Takashi
2008-02-20  2:15         ` KAMEZAWA Hiroyuki
2008-02-20  2:32           ` YAMAMOTO Takashi
2008-02-20  4:27             ` Hugh Dickins
2008-02-20  6:38     ` Balbir Singh
2008-02-20 11:00       ` Hugh Dickins
2008-02-20 11:32         ` Balbir Singh
2008-02-20 14:19           ` Hugh Dickins
2008-02-20  5:00 ` Balbir Singh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=47BBC15E.5070405@linux.vnet.ibm.com \
    --to=balbir@linux.vnet.ibm.com \
    --cc=hugh@veritas.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=linux-mm@kvack.org \
    --cc=riel@redhat.com \
    --cc=yamamoto@valinux.co.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).