From: Dave Hansen
Date: Mon, 8 Aug 2022 08:55:09 -0700
Subject: Re: [PATCHv7 02/14] mm: Add support for unaccepted memory
To: Vlastimil Babka, David Hildenbrand, "Kirill A. Shutemov", Borislav Petkov, Andy Lutomirski, Sean Christopherson, Andrew Morton, Joerg Roedel, Ard Biesheuvel
Cc: Andi Kleen, Kuppuswamy Sathyanarayanan, David Rientjes, Tom Lendacky, Thomas Gleixner, Peter Zijlstra, Paolo Bonzini, Ingo Molnar, Dario Faggioli, Mike Rapoport, marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, Mike Rapoport, Mel Gorman
Message-ID: <4992c5b3-b9a9-b4ff-b09c-1383faf1ea6f@intel.com>
References: <20220614120231.48165-1-kirill.shutemov@linux.intel.com> <20220614120231.48165-3-kirill.shutemov@linux.intel.com> <8cf143e7-2b62-1a1e-de84-e3dcc6c027a4@suse.cz>
X-Mailing-List: linux-coco@lists.linux.dev

On 8/5/22 11:17, Vlastimil Babka wrote:
>> 3. Pull the page off the 2M/4M lists, drop the zone lock, accept it,
>> then put it back.
> Worth trying, IMHO. Perhaps easier to manage if the lists are distinct from
> the normal ones, as I suggested.

I was playing with another series recently where I did this: momentarily taking pages off some of the high-order lists and dropping the zone lock. Kirill, if you go looking at this, just make sure that you don't let it happen to too much memory at once; you might end up yanking memory out of the allocator that is not reflected in NR_FREE_PAGES. You might, for instance, want to make sure that only a small number of threads can have pulled memory off the free lists at once.
Something *logically* like this:

        // Limit to two threads accepting at once:
        atomic_t nr_accepting_threads = ATOMIC_INIT(2);

retry:
        page = del_page_from_free_list(...);
        if (!PageAccepted(page)) {
                if (atomic_dec_return(&nr_accepting_threads) < 0) {
                        // already at the thread limit
                        atomic_inc(&nr_accepting_threads);
                        add_page_to_free_list(page, ...);
                        spin_unlock_irq(&zone->lock);
                        // wait for a slot...
                        spin_lock_irq(&zone->lock);
                        goto retry;
                } else {
                        spin_unlock_irq(&zone->lock);
                        accept_page(page);
                        spin_lock_irq(&zone->lock);
                        add_page_to_free_list(page, ...);
                        // do merging if it was a 2M page
                        atomic_inc(&nr_accepting_threads);
                }
        }

It's a little nasty because the whole thing is not a sleepable context. I also know that the merging code needs some refactoring if you want to do merging with 2M pages here. It might all get easier if you move all of the page allocator stuff to work only at the 4M granularity.

In any case, I'm not trying to dissuade anyone from listening to the other reviewer feedback. Just trying to save you a few cycles on a similar problem I was looking at recently.
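If it helps to see the limiting trick in isolation, below is a rough standalone userspace sketch of the same pattern, using C11 atomics and pthreads instead of the kernel's atomic_t and zone->lock. Every name in it (fake_zone_lock, accepter_slots, the two-thread limit, and so on) is invented for illustration and is not the allocator's API; it only shows the "take a slot, drop the lock, do the slow work, put the slot back" shape.

/*
 * Rough userspace sketch of the "bounded number of accepting threads"
 * idea above.  Not kernel code: the mutex stands in for zone->lock,
 * usleep() stands in for the slow accept operation, and every name
 * here is invented for illustration.
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <unistd.h>

#define NR_FAKE_PAGES   32
#define NR_THREADS      8
#define MAX_ACCEPTERS   2       /* at most two threads accepting at once */

static pthread_mutex_t fake_zone_lock = PTHREAD_MUTEX_INITIALIZER;
static int nr_unaccepted = NR_FAKE_PAGES;       /* fake "unaccepted" free list */
static atomic_int accepter_slots = MAX_ACCEPTERS;

static void *worker(void *arg)
{
        (void)arg;

        for (;;) {
                pthread_mutex_lock(&fake_zone_lock);

                if (nr_unaccepted == 0) {       /* nothing left to accept */
                        pthread_mutex_unlock(&fake_zone_lock);
                        return NULL;
                }

                /* Try to take one of the limited accepter slots. */
                if (atomic_fetch_sub(&accepter_slots, 1) <= 0) {
                        /* Over the limit: put the slot back and retry. */
                        atomic_fetch_add(&accepter_slots, 1);
                        pthread_mutex_unlock(&fake_zone_lock);
                        usleep(100);            /* "wait for a slot..." */
                        continue;
                }

                nr_unaccepted--;                /* "del_page_from_free_list()" */
                pthread_mutex_unlock(&fake_zone_lock);

                usleep(1000);                   /* stand-in for accept_page() */

                pthread_mutex_lock(&fake_zone_lock);
                /* "add_page_to_free_list()" of the now-accepted page goes here. */
                pthread_mutex_unlock(&fake_zone_lock);

                atomic_fetch_add(&accepter_slots, 1);   /* release the slot */
        }
}

int main(void)
{
        pthread_t threads[NR_THREADS];

        for (int i = 0; i < NR_THREADS; i++)
                pthread_create(&threads[i], NULL, worker, NULL);
        for (int i = 0; i < NR_THREADS; i++)
                pthread_join(threads[i], NULL);

        printf("all %d fake pages accepted\n", NR_FAKE_PAGES);
        return 0;
}

Built with something like "cc -O2 -pthread accept_limit.c", it runs eight workers over the 32 fake pages with never more than two of them inside the slow "accepting" section at any moment, which is the property the atomic counter in the kernel sketch is meant to enforce.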