Date: Tue, 18 Aug 2020 20:18:17 -0700
From: Andrew Morton
To: Doug Berger
Cc: Jason Baron, David Rientjes, "Kirill A. Shutemov", linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] mm: include CMA pages in lowmem_reserve at boot
Message-Id: <20200818201817.351499e75cba2a84e8bf33e6@linux-foundation.org>
In-Reply-To: <1597423766-27849-1-git-send-email-opendmb@gmail.com>
References: <1597423766-27849-1-git-send-email-opendmb@gmail.com>

On Fri, 14 Aug 2020 09:49:26 -0700 Doug Berger wrote:

> The lowmem_reserve arrays provide a means of applying pressure
> against allocations from lower zones that were targeted at
> higher zones. Their values are a function of the number of pages
> managed by higher zones, and they are assigned by a call to
> setup_per_zone_lowmem_reserve().
>
> That function is first called at boot by init_per_zone_wmark_min()
> and may be called again later when the
> /proc/sys/vm/lowmem_reserve_ratio sysctl file is accessed.
>
> init_per_zone_wmark_min() was moved up from a module_init to a
> core_initcall to resolve a sequencing issue with khugepaged.
> Unfortunately, this created a sequencing issue with CMA page
> accounting.
>
> CMA pages are added to the managed page count of a zone when
> cma_init_reserved_areas() is called at boot, also as a
> core_initcall. It is therefore uncertain whether the CMA pages
> are added to the managed page counts of their zones before or
> after the call to init_per_zone_wmark_min(); the outcome depends
> on link order. With the current link order, the pages are added
> to the managed counts after the lowmem_reserve arrays are
> initialized at boot.
>
> This means the lowmem_reserve values at boot may be lower than
> the values used later if /proc/sys/vm/lowmem_reserve_ratio is
> accessed, even if the ratio values are unchanged.
>
> In many cases the difference is not significant, but, for
> example, an ARM platform with 1GB of memory and the following
> memory layout
>
> [    0.000000] cma: Reserved 256 MiB at 0x0000000030000000
> [    0.000000] Zone ranges:
> [    0.000000]   DMA      [mem 0x0000000000000000-0x000000002fffffff]
> [    0.000000]   Normal   empty
> [    0.000000]   HighMem  [mem 0x0000000030000000-0x000000003fffffff]
>
> would result in a lowmem_reserve of 0 for the DMA zone. This
> would allow userspace to deplete the DMA zone easily.

Sounds fairly serious for those machines.  Was a cc:stable considered?

> Funnily enough,
>
>   $ cat /proc/sys/vm/lowmem_reserve_ratio
>
> would fix up the situation, because it forces a call to
> setup_per_zone_lowmem_reserve() as a side effect.
>
> This commit breaks the link order dependency by invoking
> init_per_zone_wmark_min() as a postcore_initcall, so that the
> CMA pages have a chance to be properly accounted in their
> zone(s), allowing the lowmem_reserve arrays to receive
> consistent values.
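
To make the arithmetic concrete: setup_per_zone_lowmem_reserve() derives
each zone's reserve against a higher zone from the cumulative managed
page count of the zones above it, divided by that zone's
lowmem_reserve_ratio entry.  Below is a minimal, standalone C sketch of
that calculation (a userspace model, not the kernel code itself); the
zone sizes follow the example layout quoted above, while the ratio
values and the toy data structures are illustrative assumptions.

/*
 * Toy model of the calculation done by setup_per_zone_lowmem_reserve().
 * reserve[i][j] is zone i's protection against allocations that were
 * targeted at the higher zone j.
 */
#include <stdio.h>

#define NR_ZONES 3

int main(void)
{
	const char *name[NR_ZONES] = { "DMA", "Normal", "HighMem" };
	/*
	 * Managed 4 KiB pages per zone for the 1GB example layout:
	 * 768 MiB DMA, empty Normal, 256 MiB HighMem (the CMA region).
	 * Set managed[2] = 0 to model the boot-time bug, where the CMA
	 * pages have not yet been added to HighMem's managed count.
	 */
	unsigned long managed[NR_ZONES] = { 196608, 0, 65536 };
	unsigned long ratio[NR_ZONES] = { 256, 32, 32 }; /* assumed defaults */
	unsigned long reserve[NR_ZONES][NR_ZONES] = { { 0 } };

	for (int i = 0; i < NR_ZONES; i++) {
		unsigned long pages = 0;

		/* Accumulate the managed pages of every zone above zone i. */
		for (int j = i + 1; j < NR_ZONES; j++) {
			pages += managed[j];
			reserve[i][j] = pages / ratio[i];
		}
	}

	for (int i = 0; i < NR_ZONES; i++)
		for (int j = i + 1; j < NR_ZONES; j++)
			printf("%s reserve vs %s allocations: %lu pages\n",
			       name[i], name[j], reserve[i][j]);
	return 0;
}

With managed[2] = 0, every DMA reserve comes out 0, which is the
depletion scenario described above; once the 65536 CMA-backed pages are
counted, the DMA zone keeps 65536 / 256 = 256 pages in reserve against
HighMem-targeted allocations.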
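
The fix the last quoted paragraph describes amounts to changing the
initcall level at which init_per_zone_wmark_min() is registered in
mm/page_alloc.c.  The diff itself is not quoted in this reply, so the
following one-line change is a sketch of what the text implies:

-core_initcall(init_per_zone_wmark_min)
+postcore_initcall(init_per_zone_wmark_min)

Since all core_initcall functions run before any postcore_initcall
function, init_per_zone_wmark_min() is then guaranteed to run after
cma_init_reserved_areas() has added the CMA pages to their zones'
managed counts, independent of link order.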