Date: Tue, 9 Mar 2021 12:32:55 +0000
From: Matthew Wilcox
To: Michal Hocko
Cc: Zhou Guanghui, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 akpm@linux-foundation.org, hannes@cmpxchg.org, hughd@google.com,
 kirill.shutemov@linux.intel.com, npiggin@gmail.com, ziy@nvidia.com,
 wangkefeng.wang@huawei.com, guohanjun@huawei.com, dingtianhong@huawei.com,
 chenweilong@huawei.com, rui.xiang@huawei.com
Subject: Re: [PATCH v2 2/2] mm/memcg: set memcg when split page
Message-ID: <20210309123255.GI3479805@casper.infradead.org>
References: <20210304074053.65527-1-zhouguanghui1@huawei.com>
 <20210304074053.65527-3-zhouguanghui1@huawei.com>
 <20210308210225.GF3479805@casper.infradead.org>

On Tue, Mar 09, 2021 at 10:02:00AM +0100, Michal Hocko wrote:
> On Mon 08-03-21 21:02:25, Matthew Wilcox wrote:
> > On Thu, Mar 04, 2021 at 07:40:53AM +0000, Zhou Guanghui wrote:
> > > For example, when alloc_pages_exact is used to allocate 1MB continuous
> > > physical memory, 2MB is charged(kmemcg is enabled and __GFP_ACCOUNT is
> > > set). When make_alloc_exact free the unused 1MB and free_pages_exact
> > > free the applied 1MB, actually, only 4KB(one page) is uncharged.
> >
> > @@ -5081,9 +5081,15 @@ void __free_pages(struct page *page, unsigned int order)
> >  {
> >  	if (put_page_testzero(page))
> >  		free_the_page(page, order);
> > -	else if (!PageHead(page))
> > -		while (order-- > 0)
> > -			free_the_page(page + (1 << order), order);
> > +	else if (!PageHead(page)) {
> > +		while (order-- > 0) {
> > +			struct page *tail = page + (1 << order);
> > +#ifdef CONFIG_MEMCG
> > +			tail->memcg_data = page->memcg_data;
> > +#endif
> > +			free_the_page(tail, order);
> > +		}
> > +	}
> >  }
> >  EXPORT_SYMBOL(__free_pages);
>
> Hmm, I was not aware of this code. This is really a tricky code.

Yes.  I only added it recently.  I don't see a better way to solve this
problem.  We could turn the non-compound page into a compound page at
this point, but I'm not sure that's really less tricky.

> > I wonder if we shouldn't initialise memcg_data on all subsequent pages
> > of non-compound allocations instead?  Because I'm not sure this is the
> > only place that needs to be fixed.
>
> That would be safer for sure. Do you mean this as a replacement to the
> original patch?
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 913c2b9e5c72..d44dea2b8d22 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -3135,8 +3135,21 @@ int __memcg_kmem_charge_page(struct page *page, gfp_t gfp, int order)
>  	if (memcg && !mem_cgroup_is_root(memcg)) {
>  		ret = __memcg_kmem_charge(memcg, gfp, 1 << order);
>  		if (!ret) {
> +			int i, nr_pages = 1 << order;
>  			page->memcg_data = (unsigned long)memcg |
>  				MEMCG_DATA_KMEM;
> +
> +			/*
> +			 * Compound pages are normally split or freed
> +			 * via their head pages so memcg_data in the
> +			 * head page should be sufficient but there
> +			 * are exceptions to the rule (see __free_pages).
> +			 * Non compound pages would need to copy memcg anyway.
> +			 */
> +			for (i = 1; i < nr_pages; i++) {
> +				struct page *p = page + i;
> +				p->memcg_data = page->memcg_data;
> +			}
>  			return 0;

I would condition this loop on if (!(gfp & __GFP_COMP)), but yes, something
along these lines.  I might phrase the comment a little differently ...

	/*
	 * Compound pages are treated as a single unit,
	 * but non-compound pages can be freed individually
	 * so each page needs to have its memcg set to get
	 * the accounting right.
	 */
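
To make the accounting argument concrete, here is a minimal user-space
sketch in C.  It is not kernel code: struct toy_memcg, struct toy_page and
the toy_* helpers are all hypothetical stand-ins invented for illustration
(toy_page.memcg plays the role of page->memcg_data), and it frees strictly
one page at a time, which is a simplification of what the page allocator
does.  It only models charging 2^order pages and then freeing them
individually, with and without the memcg being recorded in the tail pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy user-space model; none of these types are the kernel's. */
#define TOY_MAX_PAGES 512	/* order 9, i.e. 2MB worth of 4KB pages */

struct toy_memcg {
	long charged;			/* pages currently charged */
};

struct toy_page {
	struct toy_memcg *memcg;	/* stands in for page->memcg_data */
};

/*
 * Charge 2^order pages to memcg.  The head page always records the memcg;
 * the tail pages record it only if set_tails is true (the non-compound
 * case in the discussion above).
 */
static void toy_charge(struct toy_memcg *memcg, struct toy_page *page,
		       unsigned int order, bool set_tails)
{
	size_t i, nr_pages = (size_t)1 << order;

	memcg->charged += (long)nr_pages;
	page->memcg = memcg;
	if (set_tails)
		for (i = 1; i < nr_pages; i++)
			page[i].memcg = memcg;
}

/* Free a single page: it can only be uncharged if it knows its memcg. */
static void toy_free_one(struct toy_page *page)
{
	if (page->memcg) {
		page->memcg->charged--;
		page->memcg = NULL;
	}
}

/*
 * Charge 2^order pages, free every page individually, and return how many
 * pages are still charged afterwards (0 means the accounting balanced).
 */
static long toy_leak_after_free(unsigned int order, bool set_tails)
{
	struct toy_memcg memcg = { 0 };
	struct toy_page pages[TOY_MAX_PAGES] = { { NULL } };
	size_t i, nr_pages = (size_t)1 << order;

	assert(nr_pages <= TOY_MAX_PAGES);

	toy_charge(&memcg, pages, order, set_tails);
	for (i = 0; i < nr_pages; i++)
		toy_free_one(&pages[i]);

	return memcg.charged;
}
```

In this model, toy_leak_after_free(9, false) returns 511: only the head
page knows its memcg, so only one page is uncharged, which mirrors the
"only 4KB(one page) is uncharged" symptom in the original report.
toy_leak_after_free(9, true) returns 0.  A real caller would presumably
derive set_tails from !(gfp & __GFP_COMP), as suggested above.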