From: Balbir Singh <balbir@linux.vnet.ibm.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] Add file based RSS accounting for memory resource controller (v3)
Date: Wed, 22 Apr 2009 08:49:39 +0530
Message-ID: <20090422031939.GQ19637@balbir.in.ibm.com>
In-Reply-To: <20090422090218.6d451a08.kamezawa.hiroyu@jp.fujitsu.com>

* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-04-22 09:02:18]:

> On Tue, 21 Apr 2009 13:25:51 -0700
> Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> > On Fri, 17 Apr 2009 19:48:38 +0530
> > Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> > 
> > >
> > > ...
> > >
> > > We currently don't track file RSS; the RSS we report is actually anon RSS.
> > > All the file-mapped pages come in through the page cache and get accounted
> > > there. This patch adds support for accounting file RSS pages. It should
> > > 
> > > 1. Help improve the metrics reported by the memory resource controller
> > > 2. Will form the basis for a future shared memory accounting heuristic
> > >    that has been proposed by Kamezawa.
> > > 
> > > Unfortunately, we cannot rename the existing "rss" keyword used in memory.stat
> > > to "anon_rss". We do, however, add "mapped_file" data and hope to educate the
> > > end user through documentation.
> > > 
> > > Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
> > >
> > > ...
> > >
> > > @@ -1096,6 +1135,10 @@ static int mem_cgroup_move_account(struct page_cgroup *pc,
> > >  	struct mem_cgroup_per_zone *from_mz, *to_mz;
> > >  	int nid, zid;
> > >  	int ret = -EBUSY;
> > > +	struct page *page;
> > > +	int cpu;
> > > +	struct mem_cgroup_stat *stat;
> > > +	struct mem_cgroup_stat_cpu *cpustat;
> > >  
> > >  	VM_BUG_ON(from == to);
> > >  	VM_BUG_ON(PageLRU(pc->page));
> > > @@ -1116,6 +1159,23 @@ static int mem_cgroup_move_account(struct page_cgroup *pc,
> > >  
> > >  	res_counter_uncharge(&from->res, PAGE_SIZE);
> > >  	mem_cgroup_charge_statistics(from, pc, false);
> > > +
> > > +	page = pc->page;
> > > +	if (page_is_file_cache(page) && page_mapped(page)) {
> > > +		cpu = smp_processor_id();
> > > +		/* Update mapped_file data for mem_cgroup "from" */
> > > +		stat = &from->stat;
> > > +		cpustat = &stat->cpustat[cpu];
> > > +		__mem_cgroup_stat_add_safe(cpustat, MEM_CGROUP_STAT_MAPPED_FILE,
> > > +						-1);
> > > +
> > > +		/* Update mapped_file data for mem_cgroup "to" */
> > > +		stat = &to->stat;
> > > +		cpustat = &stat->cpustat[cpu];
> > > +		__mem_cgroup_stat_add_safe(cpustat, MEM_CGROUP_STAT_MAPPED_FILE,
> > > +						1);
> > > +	}
> > 
> > This function (mem_cgroup_move_account()) does a trylock_page_cgroup()
> > and if that fails it will bale out, and the newly-added code will not
> > be executed.
> Yes, and it returns -EBUSY.
> 
> > 
> > What are the implications of this?  Does the missed accounting later get
> > performed somewhere, or does the error remain in place?
> > 
> No error, just -EBUSY. The caller (currently only force_empty) will retry.
> 
> > That trylock_page_cgroup() really sucks - trylocks usually do.  Could
> > someone please raise a patch which completely documents the reasons for
> > its presence, and for any other uncommented/unobvious trylocks?
> > 
> > Where appropriate, the comment should explain why the trylock isn't
> > simply a bug - why it is safe and correct to omit the operations which
> > we wished to perform.
> > 
> > Thanks.
> > 
> Hmm... maybe we can replace the trylock with a plain lock here.
> 
> IIRC, this has been a trylock because the old routine took another lock
> (the mem_cgroup's per-zone mz->lru_lock) before calling this:
>    mz->lru_lock
>      lock_page_cgroup()
> And there was another routine which took them in the opposite order:
>    lock_page_cgroup()
>         -> mz->lru_lock.
> 
> So I used trylock here. But now the lock (mz->lru_lock) has been removed.
> I should check this.
> 
> Thank you for pointing this out.
>

This is definitely worth looking into. Since we run force_empty() in a
while loop with some margin, we've probably avoided the problem. I
think this code needs a second look and refactoring.
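
To make the pattern under discussion concrete, here is a rough userspace
sketch (not the kernel code). It assumes a pthread mutex as a stand-in for
the page_cgroup lock; struct group, struct page_meta, move_account() and
move_with_retry() are hypothetical names invented for illustration, and the
-EINVAL return for a wrong owner is my own choice for the sketch. The point
is that a failed trylock must leave both counters untouched, so that a
caller retrying on -EBUSY sees consistent accounting.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory cgroup: only the mapped_file counter. */
struct group {
	long mapped_file;
};

/* Hypothetical stand-in for a page_cgroup: a lock plus the owning group. */
struct page_meta {
	pthread_mutex_t lock;
	struct group *owner;
};

/*
 * Move the accounting of one mapped file page from "from" to "to".
 * Returns 0 on success, -EBUSY if the lock is contended (nothing has been
 * updated in that case), or -EINVAL if the page is not owned by "from".
 */
int move_account(struct page_meta *pm, struct group *from, struct group *to)
{
	assert(from != to);

	if (pthread_mutex_trylock(&pm->lock) != 0)
		return -EBUSY;	/* bail out before touching any counter */

	if (pm->owner != from) {
		pthread_mutex_unlock(&pm->lock);
		return -EINVAL;
	}

	/* Both updates happen under the lock, so they stay paired. */
	from->mapped_file--;
	to->mapped_file++;
	pm->owner = to;

	pthread_mutex_unlock(&pm->lock);
	return 0;
}

/* Caller-side retry with a bounded number of attempts ("some margin"). */
int move_with_retry(struct page_meta *pm, struct group *from,
		    struct group *to, int max_tries)
{
	int ret = -EBUSY;
	int i;

	for (i = 0; i < max_tries && ret == -EBUSY; i++)
		ret = move_account(pm, from, to);
	return ret;
}
```

If the lock is held while move_with_retry() runs, every attempt returns
-EBUSY and the counters stay as they were; once the lock is released, the
next call moves the count. Whether the bounded retry is enough in practice
is exactly the open question, since a caller that gives up leaves the page
accounted to the old group.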

 

-- 
	Balbir


