From: Andrea Arcangeli <aarcange@redhat.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: linux-mm@kvack.org, Marcelo Tosatti <mtosatti@redhat.com>,
Adam Litke <agl@us.ibm.com>, Avi Kivity <avi@redhat.com>,
Izik Eidus <ieidus@redhat.com>,
Hugh Dickins <hugh.dickins@tiscali.co.uk>,
Nick Piggin <npiggin@suse.de>, Rik van Riel <riel@redhat.com>,
Mel Gorman <mel@csn.ul.ie>, Andi Kleen <andi@firstfloor.org>,
Dave Hansen <dave@linux.vnet.ibm.com>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>,
Ingo Molnar <mingo@elte.hu>, Mike Travis <travis@sgi.com>,
Christoph Lameter <cl@linux-foundation.org>,
Chris Wright <chrisw@sous-sol.org>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 27 of 28] memcg compound
Date: Fri, 18 Dec 2009 17:02:17 +0100
Message-ID: <20091218160217.GO29790@random.random>
In-Reply-To: <20091218102701.7fa7124d.kamezawa.hiroyu@jp.fujitsu.com>
On Fri, Dec 18, 2009 at 10:27:01AM +0900, KAMEZAWA Hiroyuki wrote:
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -1288,15 +1288,20 @@ static atomic_t memcg_drain_count;
> > * cgroup which is not current target, returns false. This stock will be
> > * refilled.
> > */
> > -static bool consume_stock(struct mem_cgroup *mem)
> > +static bool consume_stock(struct mem_cgroup *mem, int *page_size)
> > {
> > struct memcg_stock_pcp *stock;
> > bool ret = true;
> >
> > stock = &get_cpu_var(memcg_stock);
> > - if (mem == stock->cached && stock->charge)
> > - stock->charge -= PAGE_SIZE;
> > - else /* need to call res_counter_charge */
> > + if (mem == stock->cached && stock->charge) {
> > + if (*page_size > stock->charge) {
> > + *page_size -= stock->charge;
> > + stock->charge = 0;
> > + ret = false;
> > + } else
> > + stock->charge -= *page_size;
> > + } else /* need to call res_counter_charge */
> > ret = false;
>
> I feel we should skip this per-cpu caching method because counter overflow
> rate is the key for this workaround.
> Then,
> if (size == PAGESIZE)
> consume_stock()
> seems better to me.
Ok. I did it that way to be sure never to underestimate the reserved
space still available. Wasting 128k per cgroup seems no big deal to
me, so I can skip it. Clearly, performance-wise, including the per-cpu
reservation was worthless for 2M pages (the reservation is only 128k).
It was there only to keep the accounting as strict as it should be,
because the other code there really went all the way down to
csize = page_size in case of failure and tried again. But then it
didn't send an IPI to the other cpus to release their stock. So
basically you should also remove that "retry" path, which looks pretty
worthless when the other cpu queues are not drained before retrying
and hugepages bypass the cache entirely. Just assume the cache is an
error of 128k*nr_cpus.
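
To make the accounting argument concrete, here is a rough userspace
model of the consume_stock() logic from the quoted hunk. It is only a
sketch: struct stock_model, consume_stock_model(), stock_error_bound()
and the MODEL_* constants are made-up names for illustration, and it
assumes 4k base pages with a 128k per-cpu reservation as discussed
above. There is no per-cpu handling or locking here.

```c
#include <stdbool.h>

/* Assumed values for illustration: 4k base pages, 128k per-cpu stock. */
#define MODEL_PAGE_SIZE   4096
#define MODEL_CHARGE_SIZE (32 * MODEL_PAGE_SIZE)

/* Hypothetical stand-in for the per-cpu stock structure. */
struct stock_model {
	const void *cached;	/* stands in for struct mem_cgroup * */
	int charge;		/* bytes already charged and held on this cpu */
};

/*
 * Same branches as the quoted hunk: returns true if the whole request
 * was served from the stock. On a partial hit the stock is emptied and
 * *page_size is reduced to what still has to be charged.
 */
static bool consume_stock_model(struct stock_model *stock, const void *mem,
				int *page_size)
{
	if (mem != stock->cached || !stock->charge)
		return false;	/* need to charge the counter directly */

	if (*page_size > stock->charge) {
		*page_size -= stock->charge;
		stock->charge = 0;
		return false;
	}

	stock->charge -= *page_size;
	return true;
}

/* Bytes that may sit unused in the stocks if hugepages bypass them. */
static long stock_error_bound(int nr_cpus)
{
	return (long)MODEL_CHARGE_SIZE * nr_cpus;
}
```

With a full 128k stock, a 2M request takes the partial-hit branch and
still has 2M - 128k left to charge, which is why the stock buys nothing
performance-wise for hugepages and only tightens the accounting.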
> > put_cpu_var(memcg_stock);
> > return ret;
> > @@ -1401,13 +1406,13 @@ static int __cpuinit memcg_stock_cpu_cal
> > * oom-killer can be invoked.
> > */
> > static int __mem_cgroup_try_charge(struct mm_struct *mm,
> > - gfp_t gfp_mask, struct mem_cgroup **memcg,
> > - bool oom, struct page *page)
> > + gfp_t gfp_mask, struct mem_cgroup **memcg,
> > + bool oom, struct page *page, int page_size)
> > {
> > struct mem_cgroup *mem, *mem_over_limit;
> > int nr_retries = MEM_CGROUP_RECLAIM_RETRIES;
> > struct res_counter *fail_res;
> > - int csize = CHARGE_SIZE;
> > + int csize = max(page_size, (int) CHARGE_SIZE);
> >
> we need max() ?
Not sure I understand the question. max(2M, 128k) looks ok.
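
For reference, the arithmetic behind that max() can be sketched in
userspace as below. The BATCH_* constants and charge_batch_size() are
hypothetical names, and the sizes (4k base page, 128k batch, 2M
hugepage) are assumptions taken from the numbers in this thread.

```c
/* Assumed sizes for illustration only. */
#define BATCH_PAGE_SIZE   4096
#define BATCH_CHARGE_SIZE (32 * BATCH_PAGE_SIZE)	/* 128k */
#define BATCH_HPAGE_SIZE  (2 * 1024 * 1024)		/* 2M */

/*
 * Equivalent of csize = max(page_size, (int) CHARGE_SIZE): a small page
 * is charged in a 128k batch, a hugepage is charged at its own size.
 */
static int charge_batch_size(int page_size)
{
	return page_size > BATCH_CHARGE_SIZE ? page_size : BATCH_CHARGE_SIZE;
}
```

So charge_batch_size(BATCH_PAGE_SIZE) gives 128k and
charge_batch_size(BATCH_HPAGE_SIZE) gives 2M. Without the max() a 2M
charge would ask the counter for only 128k.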
> I think we should skip this.
> And skip this.
as per above, ok.
> Ah..Hmm...this will be much more complicated after Nishimura's "task move" method
> is merged. But ok, for this patch itself.
So my hope is that my patch goes in first so I don't have to make the
much more complicated fix ahahaa ;). Just kidding... Well, it's up to
you how you want to handle this.
> Thank you! Seems simpler than expected!
You're welcome. Thanks for the review. So how do we want to go from
here? Will you incorporate those changes yourself, so I only have to
maintain the huge_memory.c part that depends on the above? The above
is transparent hugepage agnostic. For the time being I guess I am
forced to also keep it in my patchset, otherwise the kernel would fail
if somebody uses mem cgroup, but the ideal is to keep this patch in
sync and for me to drop it as soon as it goes in.