Date: Wed, 12 Jul 2023 10:18:45 +0100
From: Mel Gorman
To: "Kirill A. Shutemov"
Cc: Borislav Petkov, Andy Lutomirski, Dave Hansen, Sean Christopherson,
	Andrew Morton, Joerg Roedel, Ard Biesheuvel, Andi Kleen,
	Kuppuswamy Sathyanarayanan, David Rientjes, Vlastimil Babka,
	Tom Lendacky, Thomas Gleixner, Peter Zijlstra, Paolo Bonzini,
	Ingo Molnar, Dario Faggioli, Mike Rapoport, David Hildenbrand,
	marcelo.cerri@canonical.com, tim.gardner@canonical.com,
	khalid.elmously@canonical.com, philip.cox@canonical.com,
	aarcange@redhat.com, peterx@redhat.com, x86@kernel.org,
	linux-mm@kvack.org, linux-coco@lists.linux.dev,
	linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCHv14 5/9] efi: Add unaccepted memory support
Message-ID: <20230712091845.hahda3xgvegv5hgf@techsingularity.net>
References: <20230606142637.5171-1-kirill.shutemov@linux.intel.com>
	<20230606142637.5171-6-kirill.shutemov@linux.intel.com>
	<20230703132518.3ukqyolnes47i5r3@techsingularity.net>
	<20230704143740.bgimyg3bqsgxbm47@box.shutemov.name>
In-Reply-To: <20230704143740.bgimyg3bqsgxbm47@box.shutemov.name>

On Tue, Jul 04, 2023 at 05:37:40PM +0300, Kirill A. Shutemov wrote:
> On Mon, Jul 03, 2023 at 02:25:18PM +0100, Mel Gorman wrote:
> > On Tue, Jun 06, 2023 at 05:26:33PM +0300, Kirill A. Shutemov wrote:
> > > efi_config_parse_tables() reserves memory that holds unaccepted memory
> > > configuration table so it won't be reused by page allocator.
> > >
> > > Core-mm requires few helpers to support unaccepted memory:
> > >
> > >  - accept_memory() checks the range of addresses against the bitmap and
> > >    accept memory if needed.
> > >
> > >  - range_contains_unaccepted_memory() checks if anything within the
> > >    range requires acceptance.
> > >
> > > Architectural code has to provide efi_get_unaccepted_table() that
> > > returns pointer to the unaccepted memory configuration table.
> > >
> > > arch_accept_memory() handles arch-specific part of memory acceptance.
> > >
> > > Signed-off-by: Kirill A. Shutemov
> > > Reviewed-by: Ard Biesheuvel
> > > Reviewed-by: Tom Lendacky
> >
> > By and large, this looks ok from the page allocator perspective as the
> > checks for unaccepted are mostly after watermark checks. However, if you
> > look in the initial fast path, you'll see this
> >
> > 	/*
> > 	 * Forbid the first pass from falling back to types that fragment
> > 	 * memory until all local zones are considered.
> > 	 */
> > 	alloc_flags |= alloc_flags_nofragment(ac.preferred_zoneref->zone, gfp);
> >
> > While checking watermarks should be fine from a functional perspective and
> > the fast paths are unaffected, there is a risk of premature fragmentation
> > until all memory has been accepted. Meeting watermarks does not necessarily
> > mean that fragmentation is avoided as pageblocks can get mixed while still
> > meeting watermarks.
>
> Could you elaborate on this scenario?
>
> Current code checks the watermark, if it is met, try rmqueue().
>
> If rmqueue() fails anyway, try to accept more pages and retry the zone if
> it is successful.
>
> I'm not sure how we can get to the 'if (no_fallback) {' case with any
> unaccepted memory in the allowed zones.
>

Let's take an extreme example and assume that the low watermark is lower
than 2MB (one pageblock). Just before the watermark is reached (free
count between 1MB and 2MB), it is unlikely that all free pages are within
pageblocks of the same migratetype (e.g. MIGRATE_MOVABLE). If there is an
allocation near the watermark of a different type (e.g. MIGRATE_UNMOVABLE)
then the page allocation could fall back to a different pageblock, and
that pageblock is now mixed.
This kind of mixing is only obvious if you are explicitly checking for
it via tracepoints. It can happen in the normal case too, but unaccepted
memory makes it worse because the pageblock mixing could have been
avoided if the "no_fallback" case accepted at least one new pageblock
instead of falling back.

That is an extreme example but the same logic applies when the free count
is at or near MIGRATE_TYPES*pageblock_nr_pages, as it is not guaranteed
that the pageblocks with free pages are of a migratetype that matches the
allocation request. Hence, it may be more robust from a fragmentation
perspective if ALLOC_NOFRAGMENT requests accept memory when it is
available and retry, before clearing ALLOC_NOFRAGMENT and mixing
pageblocks before the watermarks are reached.

> I see that there's preferred_zoneref and spread_dirty_pages cases, but
> unaccepted memory seems change nothing for them.
>

preferred_zoneref is about premature zone exhaustion and spread_dirty_pages
is about avoiding premature stalls on a node/zone due to an imbalance in
the number of pages waiting for writeback to complete. There is an
argument to be made that they should also accept memory but it's less
clear how much of a problem this is. Both are very obvious when they
"fail" and are likely covered by the existing watermark checks. Premature
pageblock mixing is more subtle as the final impact (root cause of a
premature THP allocation failure) is harder to detect.

-- 
Mel Gorman
SUSE Labs
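
To make the two orderings concrete, here is a toy user-space model of the
extreme example above. It is a sketch under heavy assumptions, not kernel
code: every name in it (toy_zone, toy_alloc_page, FALLBACK_FIRST,
ACCEPT_FIRST and so on) is made up for illustration, and it ignores
watermarks, allocation orders, free lists and per-cpu lists entirely. It
only tracks which migratetype first used each pageblock and whether a page
of another type later landed in it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy constants: 4K pages and 2MB pageblocks, as in the example above. */
#define PAGES_PER_BLOCK 512
#define NR_BLOCKS       8

enum mt { MT_NONE, MT_MOVABLE, MT_UNMOVABLE };

/* Hypothetical policies: fall back before accepting, or accept first. */
enum policy { FALLBACK_FIRST, ACCEPT_FIRST };

struct block {
	bool accepted;	/* false: still unaccepted, not usable yet */
	enum mt owner;	/* migratetype the pageblock was first used for */
	int free;	/* free pages left in the block */
	bool mixed;	/* set once a page of another type lands here */
};

struct toy_zone {
	struct block blk[NR_BLOCKS];
};

/* One accepted MOVABLE block with free_pages left; the rest unaccepted. */
void toy_zone_init(struct toy_zone *z, int free_pages)
{
	for (size_t i = 0; i < NR_BLOCKS; i++)
		z->blk[i] = (struct block){ false, MT_NONE, PAGES_PER_BLOCK, false };
	z->blk[0] = (struct block){ true, MT_MOVABLE, free_pages, false };
}

/* Allocate one page of type mt; returns false if nothing is left. */
bool toy_alloc_page(struct toy_zone *z, enum mt mt, enum policy policy)
{
	struct block *same = NULL, *other = NULL, *unaccepted = NULL;

	for (size_t i = 0; i < NR_BLOCKS; i++) {
		struct block *b = &z->blk[i];

		if (!b->accepted) {
			if (!unaccepted)
				unaccepted = b;
		} else if (b->free > 0) {
			if (b->owner == mt) {
				if (!same)
					same = b;
			} else if (!other) {
				other = b;
			}
		}
	}

	/*
	 * No matching block: ACCEPT_FIRST accepts a fresh pageblock now,
	 * FALLBACK_FIRST only does so when there is nothing to steal from.
	 */
	if (!same && unaccepted && (policy == ACCEPT_FIRST || !other)) {
		unaccepted->accepted = true;
		unaccepted->owner = mt;
		same = unaccepted;
	}
	if (same) {
		same->free--;
		return true;
	}
	if (other) {
		/* Fallback: the page lands in another type's pageblock. */
		other->free--;
		other->mixed = true;
		return true;
	}
	return false;
}

int toy_count_mixed(const struct toy_zone *z)
{
	int n = 0;

	for (size_t i = 0; i < NR_BLOCKS; i++)
		n += z->blk[i].mixed;
	return n;
}

/* 1.5MB free in a MOVABLE block, then a single UNMOVABLE request. */
int toy_mixed_after_unmovable_alloc(enum policy policy)
{
	struct toy_zone z;
	bool ok;

	toy_zone_init(&z, 384);
	ok = toy_alloc_page(&z, MT_UNMOVABLE, policy);
	assert(ok);
	return toy_count_mixed(&z);
}
```

In this model, toy_mixed_after_unmovable_alloc(FALLBACK_FIRST) leaves one
mixed pageblock, while toy_mixed_after_unmovable_alloc(ACCEPT_FIRST) leaves
none at the cost of accepting one 2MB pageblock earlier than strictly
needed. Whether that trade is worthwhile in the real allocator is exactly
the open question.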