From: Michal Hocko <mhocko@suse.cz>
To: Sha Zhengju <handai.szj@gmail.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	cgroups@vger.kernel.org, linux-mm@kvack.org,
	akpm@linux-foundation.org, gthelen@google.com,
	hannes@cmpxchg.org, hughd@google.com,
	Sha Zhengju <handai.szj@taobao.com>
Subject: Re: [PATCH] memcg: simplify lock of memcg page stat accounting
Date: Wed, 30 Jan 2013 10:12:29 +0100
Message-ID: <20130130091229.GA16098@dhcp22.suse.cz>
In-Reply-To: <CAFj3OHXyWN+zUMAaSEOz2gCP7Bm6v4Zex=Rq=7A9CkHTp3j1UQ@mail.gmail.com>

On Tue 29-01-13 23:29:35, Sha Zhengju wrote:
> On Tue, Jan 29, 2013 at 8:41 AM, Kamezawa Hiroyuki
> <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> > (2013/01/26 20:12), Sha Zhengju wrote:
> >> From: Sha Zhengju <handai.szj@taobao.com>
> >>
> >> After removing duplicated information like PCG_*
> >> flags in 'struct page_cgroup'(commit 2ff76f1193), there's a problem
> >> between "move" and "page stat accounting"(only FILE_MAPPED is supported
> >> now but other stats will be added in future):
> >> assume CPU-A does "page stat accounting" and CPU-B does "move"
> >>
> >> CPU-A                        CPU-B
> >> TestSet PG_dirty
> >> (delay)               move_lock_mem_cgroup()
> >>                          if (PageDirty(page)) {
> >>                               old_memcg->nr_dirty --
> >>                               new_memcg->nr_dirty++
> >>                          }
> >>                          pc->mem_cgroup = new_memcg;
> >>                          move_unlock_mem_cgroup()
> >>
> >> move_lock_mem_cgroup()
> >> memcg = pc->mem_cgroup
> >> memcg->nr_dirty++
> >> move_unlock_mem_cgroup()
> >>
> >> while accounting information of new_memcg may be double-counted. So we
> >> use a bigger lock to solve this problem:  (commit: 89c06bd52f)
> >>
> >>        move_lock_mem_cgroup() <-- mem_cgroup_begin_update_page_stat()
> >>        TestSetPageDirty(page)
> >>        update page stats (without any checks)
> >>        move_unlock_mem_cgroup() <-- mem_cgroup_end_update_page_stat()
> >>
> >>
> >> But this method also has its pros and cons: at present we use two layers
> >> of lock avoidance (memcg_moving and memcg->moving_account) and then a spinlock
> >> on memcg (see mem_cgroup_begin_update_page_stat()), but the lock granularity
> >> is rather coarse: not only the critical section but also some surrounding
> >> code logic falls inside the locked range, which may be deadlock prone. As
> >> dirty and writeback stats are added, it gets into further difficulty with
> >> the page cache radix tree lock, and it seems that the lock requires nesting.
> >> (https://lkml.org/lkml/2013/1/2/48)
> >>
> >> So in order to make the lock simpler and clearer and also avoid the 'nesting'
> >> problem, a choice may be:
> >> (CPU-A does "page stat accounting" and CPU-B does "move")
> >>
> >>         CPU-A                        CPU-B
> >>
> >> move_lock_mem_cgroup()
> >> memcg = pc->mem_cgroup
> >> TestSetPageDirty(page)
> >> move_unlock_mem_cgroup()
> >>                               move_lock_mem_cgroup()
> >>                               if (PageDirty) {
> >>                                    old_memcg->nr_dirty --;
> >>                                    new_memcg->nr_dirty ++;
> >>                               }
> >>                               pc->mem_cgroup = new_memcg
> >>                               move_unlock_mem_cgroup()
> >>
> >> memcg->nr_dirty ++
> >>
> >
> > Hmm. no race with file truncate ?
> >
> 
> Do you mean "dirty page accounting" racing with truncate?  Yes, if
> someone else truncates and sets page->mapping=NULL just before CPU-A's
> 'memcg->nr_dirty ++', then it'll have no chance to correct the figure
> back. So my rough idea now is to make some small changes to
> __set_page_dirty/__set_page_dirty_nobuffers so that they do SetPageDirty
> inside ->tree_lock.
> 
> But, in the current code, is there any chance that
> mem_cgroup_move_account() races with truncate such that PageAnon is
> false (since page->mapping is cleared) but later, in page_remove_rmap(),
> the new memcg's stats are over-decremented...?

We are not checking page->mapping but rather page_mapped() which
checks page->_mapcount and that is protected from races with
mem_cgroup_move_account by mem_cgroup_begin_update_page_stat locking.
Makes sense?
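
To make the argument concrete, here is a rough user-space sketch of the
invariant, not kernel code. All names ending in _model are hypothetical
stand-ins: mapcount stands in for page->_mapcount (0-based here, whereas
the kernel starts at -1), memcg for pc->mem_cgroup, and a pthread mutex
for the section bracketed by mem_cgroup_begin_update_page_stat() and its
end counterpart. The real lock is per-memcg and usually avoided on the
fast path; this sketch simplifies it to one lock per page. The point is
only that the mapcount change, the stat update and the move all happen
under the same lock, so the move path never sees a half-done update.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins, not kernel types. */
struct memcg_model {
	long nr_file_mapped;		/* models the FILE_MAPPED stat */
};

struct page_model {
	int mapcount;			/* models page->_mapcount, 0 = unmapped */
	struct memcg_model *memcg;	/* models pc->mem_cgroup */
	pthread_mutex_t move_lock;	/* models the begin/end update section */
};

static bool page_mapped_model(const struct page_model *page)
{
	return page->mapcount > 0;
}

/* Accounting side: first mapping charges the current owner. */
static void page_add_rmap_model(struct page_model *page)
{
	pthread_mutex_lock(&page->move_lock);
	if (++page->mapcount == 1)
		page->memcg->nr_file_mapped++;
	pthread_mutex_unlock(&page->move_lock);
}

/* Accounting side: last unmapping uncharges the current owner. */
static void page_remove_rmap_model(struct page_model *page)
{
	pthread_mutex_lock(&page->move_lock);
	assert(page->mapcount > 0);
	if (--page->mapcount == 0)
		page->memcg->nr_file_mapped--;
	pthread_mutex_unlock(&page->move_lock);
}

/* Move side: transfer the stat only if the page is mapped right now. */
static void move_account_model(struct page_model *page,
			       struct memcg_model *to)
{
	pthread_mutex_lock(&page->move_lock);
	if (page_mapped_model(page)) {
		page->memcg->nr_file_mapped--;
		to->nr_file_mapped++;
	}
	page->memcg = to;
	pthread_mutex_unlock(&page->move_lock);
}
```

Because the move path tests the mapcount rather than page->mapping, a
truncate that clears page->mapping does not change what gets transferred
in this model; a later page_remove_rmap_model() decrements whichever
group owns the page at that point.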

> Call me silly...but I really get dizzy by those locks now, need to
> have a run to refresh my head... : (

Yeah, that part is funny for a certain reading of the word funny ;)
-- 
Michal Hocko
SUSE Labs

