All of lore.kernel.org
 help / color / mirror / Atom feed
From: Balbir Singh <balbir@linux.vnet.ibm.com>
To: Hugh Dickins <hugh@veritas.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"yamamoto@valinux.co.jp" <yamamoto@valinux.co.jp>,
	"riel@redhat.com" <riel@redhat.com>
Subject: Re: [RFC][PATCH] Clarify mem_cgroup lock handling and avoid races.
Date: Wed, 20 Feb 2008 11:27:50 +0530	[thread overview]
Message-ID: <47BBC15E.5070405@linux.vnet.ibm.com> (raw)
In-Reply-To: <Pine.LNX.4.64.0802191449490.6254@blonde.site>

Hugh Dickins wrote:
> On Tue, 19 Feb 2008, KAMEZAWA Hiroyuki wrote:
>> I'd like to start from RFC.
>>
>> In following code
>> ==
>>   lock_page_cgroup(page);
>>   pc = page_get_page_cgroup(page);
>>   unlock_page_cgroup(page);
>>
>>   access 'pc' later..
>> == (See, page_cgroup_move_lists())
>>
>> There is a race because 'pc' is not a stable value without lock_page_cgroup().
>> (mem_cgroup_uncharge can free this 'pc').
>>
>> For example, page_cgroup_move_lists() access pc without lock.
>> There is a small race window, between page_cgroup_move_lists()
>> and mem_cgroup_uncharge(). At uncharge, page_cgroup struct is immedieately
>> freed but move_list can access it after taking lru_lock.
>> (*) mem_cgroup_uncharge_page() can be called without zone->lru lock.
>>
>> This is not good manner.
>> .....
>> There is no quick fix (maybe). Moreover, I hear some people around me said
>> current memcontrol.c codes are very complicated.
>> I agree ;( ..it's caued by my work.
>>
>> I'd like to fix problems in clean way.
>> (Note: current -rc2 codes works well under heavy pressure. but there
>>  is possibility of race, I think.)
> 
> Yes, yes, indeed, I've been working away on this too.
> 
> Ever since the VM_BUG_ON(page_get_page_cgroup(page)) went into
> free_hot_cold_page (at my own prompting), I've been hitting it
> just very occasionally in my kernel build testing.  Was unable
> to reproduce it over the New Year, but a week or two ago found
> one machine and config on which it is relatively reproducible,
> pretty sure to happen within 12 hours.
> 
> And on Saturday evening at last identified the cause, exactly
> where you have: that unsafety in mem_cgroup_move_lists - which
> has the nice property of putting pages from the lru on to SLUB's
> freelist!
> 
> Unlike the unsafeties of force_empty, this is liable to hit anyone
> running with MEM_CONT compiled in, they don't have to be consciously
> using mem_cgroups at all.
> 
> (I consider that, by the way, quite a serious defect in the current
> mem_cgroup work: that a distro compiling it in for 1% of customers
> is then subjecting all to the mem_cgroup overhead - effectively
> doubled struct page size and unnecessary accounting overhead.  I
> believe there needs to be a way to opt out, a force_empty which
> sticks.  Yes, I know the page_cgroup which does that doubling of
> size is only allocated on demand, but every page cache page and
> every anonymous page is going to have one.  A kmem_cache for them
> will reduce the extra, but there still needs to be a way to opt
> out completely.)
> 

I've been thinking along these lines as well

1. Have a boot option to turn on/off the memory controller
2. Have a separate cache for the page_cgroup structure. I sent this suggestion
   out just yesterday or so.

I agree that these are necessary enhancements/changes.

[snip]



-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2008-02-20  6:02 UTC|newest]

Thread overview: 63+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-02-19 12:54 [RFC][PATCH] Clarify mem_cgroup lock handling and avoid races KAMEZAWA Hiroyuki
2008-02-19 15:40 ` Hugh Dickins
2008-02-19 15:54   ` kamezawa.hiroyu
2008-02-19 16:26     ` Hugh Dickins
2008-02-20  1:55       ` KAMEZAWA Hiroyuki
2008-02-20  2:05         ` YAMAMOTO Takashi
2008-02-20  2:15           ` KAMEZAWA Hiroyuki
2008-02-20  2:32             ` YAMAMOTO Takashi
2008-02-20  4:27               ` Hugh Dickins
2008-02-20  6:38       ` Balbir Singh
2008-02-20 11:00         ` Hugh Dickins
2008-02-20 11:32           ` Balbir Singh
2008-02-20 14:19             ` Hugh Dickins
2008-02-20  1:03   ` KAMEZAWA Hiroyuki
2008-02-20  4:14     ` Hugh Dickins
2008-02-20  4:37       ` KAMEZAWA Hiroyuki
2008-02-20  4:39         ` Hugh Dickins
2008-02-20  4:41           ` Balbir Singh
2008-02-20  6:40         ` Balbir Singh
2008-02-20  7:23           ` KAMEZAWA Hiroyuki
2008-02-20  3:14   ` KAMEZAWA Hiroyuki
2008-02-20  3:37     ` YAMAMOTO Takashi
2008-02-20  4:13       ` KAMEZAWA Hiroyuki
2008-02-20  4:32     ` Hugh Dickins
2008-02-20  5:57   ` Balbir Singh [this message]
2008-02-20  9:58     ` Hirokazu Takahashi
2008-02-20 10:06       ` Paul Menage
2008-02-20 10:11         ` Balbir Singh
2008-02-20 10:18           ` Paul Menage
2008-02-20 10:55             ` Balbir Singh
2008-02-20 11:21               ` KAMEZAWA Hiroyuki
2008-02-20 11:18                 ` Balbir Singh
2008-02-20 11:32                   ` KAMEZAWA Hiroyuki
2008-02-20 11:34                     ` Balbir Singh
2008-02-20 11:44                       ` Paul Menage
2008-02-20 11:41                   ` Paul Menage
2008-02-20 11:36       ` Balbir Singh
2008-02-20 11:55         ` Paul Menage
2008-02-21  2:49         ` Hirokazu Takahashi
2008-02-21  6:35           ` KAMEZAWA Hiroyuki
2008-02-21  9:07             ` Hirokazu Takahashi
2008-02-21  9:21               ` KAMEZAWA Hiroyuki
2008-02-21  9:28                 ` Balbir Singh
2008-02-21  9:44                   ` KAMEZAWA Hiroyuki
2008-02-22  3:31                     ` [RFC] Block I/O Cgroup Hirokazu Takahashi
2008-02-22  5:05                       ` KAMEZAWA Hiroyuki
2008-02-22  5:45                         ` Hirokazu Takahashi
2008-02-21  9:25               ` [RFC][PATCH] Clarify mem_cgroup lock handling and avoid races Balbir Singh
2008-02-20  6:27   ` Hirokazu Takahashi
2008-02-20  6:50     ` KAMEZAWA Hiroyuki
2008-02-20  8:32       ` Clean up force_empty (Was Re: [RFC][PATCH] Clarify mem_cgroup lock handling and avoid races.) KAMEZAWA Hiroyuki
2008-02-20 10:07         ` Clean up force_empty Hirokazu Takahashi
2008-02-22  9:24       ` [RFC][PATCH] Clarify mem_cgroup lock handling and avoid races Hugh Dickins
2008-02-22 10:07         ` KAMEZAWA Hiroyuki
2008-02-22 10:25           ` Hugh Dickins
2008-02-22 10:34             ` KAMEZAWA Hiroyuki
2008-02-22 10:50         ` Hirokazu Takahashi
2008-02-22 11:14         ` Balbir Singh
2008-02-22 12:00           ` Hugh Dickins
2008-02-22 12:28             ` Balbir Singh
2008-02-22 12:53               ` Hugh Dickins
2008-02-25  3:18                 ` Balbir Singh
2008-02-20  5:00 ` Balbir Singh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=47BBC15E.5070405@linux.vnet.ibm.com \
    --to=balbir@linux.vnet.ibm.com \
    --cc=hugh@veritas.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=linux-mm@kvack.org \
    --cc=riel@redhat.com \
    --cc=yamamoto@valinux.co.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.