Date: Tue, 27 Oct 2020 16:29:02 +0100
From: Oscar Salvador
To: Dave Hansen
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	yang.shi@linux.alibaba.com, rientjes@google.com,
	ying.huang@intel.com, dan.j.williams@intel.com
Subject: Re: [RFC][PATCH 5/9] mm/migrate: demote pages during reclaim
Message-ID: <20201027152858.GA11135@linux>
References: <20201007161736.ACC6E387@viggo.jf.intel.com>
	<20201007161745.26B1D789@viggo.jf.intel.com>
In-Reply-To: <20201007161745.26B1D789@viggo.jf.intel.com>

On Wed, Oct 07, 2020 at 09:17:45AM -0700, Dave Hansen wrote:
> Signed-off-by: Dave Hansen
> Cc: Yang Shi
> Cc: David Rientjes
> Cc: Huang Ying
> Cc: Dan Williams

I am still going through all the details, but here are my thoughts on the
things that caught my eye:

> --- a/include/linux/migrate.h~demote-with-migrate_pages	2020-10-07 09:15:31.028642442 -0700
> +++ b/include/linux/migrate.h	2020-10-07 09:15:31.034642442 -0700
> @@ -27,6 +27,7 @@ enum migrate_reason {
>  	MR_MEMPOLICY_MBIND,
>  	MR_NUMA_MISPLACED,
>  	MR_CONTIG_RANGE,
> +	MR_DEMOTION,
>  	MR_TYPES

I think you also need to add it to include/trace/events/migrate.h, so that
the mm_migrate_pages event knows about it.

> +bool migrate_demote_page_ok(struct page *page, struct scan_control *sc)

Make it static? Also, scan_control seems to be unused here.

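To illustrate the MR_DEMOTION point above, here is a minimal userspace
sketch of keeping an enum and its printable names in sync from a single
list, which is the kind of pairing the trace header needs. The macro
MIGRATE_REASON_LIST and the helper migrate_reason_name are hypothetical,
the list only covers the reasons visible in the quoted hunk, and the
strings are illustrative; this is not the kernel header itself.

```c
#include <assert.h>

/*
 * Userspace model only: one list drives both the enum and its names,
 * so adding MR_DEMOTION in one place cannot leave the other stale.
 * The list is limited to the reasons visible in the quoted hunk.
 */
#define MIGRATE_REASON_LIST(X)			\
	X(MR_MEMPOLICY_MBIND,	"mempolicy_mbind")	\
	X(MR_NUMA_MISPLACED,	"numa_misplaced")	\
	X(MR_CONTIG_RANGE,	"contig_range")		\
	X(MR_DEMOTION,		"demotion")

enum migrate_reason {
#define X(sym, str) sym,
	MIGRATE_REASON_LIST(X)
#undef X
	MR_TYPES
};

/* Name table indexed by reason; strings are illustrative. */
static const char *const migrate_reason_names[] = {
#define X(sym, str) [sym] = str,
	MIGRATE_REASON_LIST(X)
#undef X
};

/* Compile-time check that every reason has a name. */
static_assert(sizeof(migrate_reason_names) / sizeof(migrate_reason_names[0])
	      == MR_TYPES, "every migrate_reason needs a name");

/* Hypothetical helper: map a reason to its printable name. */
static const char *migrate_reason_name(enum migrate_reason reason)
{
	if ((unsigned int)reason >= MR_TYPES)
		return "unknown";
	return migrate_reason_names[reason];
}
```

With two hand-maintained lists instead, forgetting the second one is
exactly the omission pointed out here.
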
> +{
> +	int next_nid = next_demotion_node(page_to_nid(page));
> +
> +	VM_BUG_ON_PAGE(!PageLocked(page), page);

Right after the call to migrate_demote_page_ok, we call unlock_page, which
already has this check in place. I know this is only to be on the safe side
and we do not lose anything by it, but just my thoughts.

> +static struct page *alloc_demote_page(struct page *page, unsigned long node)
> +{
> +	/*
> +	 * Try to fail quickly if memory on the target node is not
> +	 * available.  Leaving out __GFP_IO and __GFP_FS helps with
> +	 * this.  If the desintation node is full, we want kswapd to
> +	 * run there so that its pages will get reclaimed and future
> +	 * migration attempts may succeed.
> +	 */
> +	gfp_t flags = (__GFP_HIGHMEM | __GFP_MOVABLE | __GFP_NORETRY |
> +		       __GFP_NOMEMALLOC | __GFP_NOWARN | __GFP_THISNODE |
> +		       __GFP_KSWAPD_RECLAIM);

I think it would be nicer to have this defined as a real GFP_ thingy,
e.g. GFP_DEMOTION.

> +	/* HugeTLB pages should not be on the LRU */
> +	WARN_ON_ONCE(PageHuge(page));

I am not sure about this one. This could only happen if the page, which is
now on another list, ends up in the buddy system. That is quite unlikely tbh.
And nevertheless, this is only a warning, which means that if this scenario
does happen, we will be allocating a single page to satisfy a higher-order
page, and I am not sure about the situation we will end up with.

> +
> +	if (PageTransHuge(page)) {
> +		struct page *thp;
> +
> +		flags |= __GFP_COMP;
> +
> +		thp = alloc_pages_node(node, flags, HPAGE_PMD_ORDER);
> +		if (!thp)
> +			return NULL;
> +		prep_transhuge_page(thp);
> +		return thp;
> +	}
> +
> +	return __alloc_pages_node(node, flags, 0);

Would it make sense to transform this into some sort of new_demotion_page,
which actually calls alloc_migration_target with the right stuff in place?
And then pass a struct migration_target_control so alloc_migration_target
does the right thing.
alloc_migration_target also takes care of calling prep_transhuge_page when
needed, e.g.:

	static struct page *new_demotion_page(struct page *page, unsigned long private)
	{
		struct migration_target_control mtc = {
			.nid = private,
			.gfp_mask = GFP_DEMOTION,
		};

		if (PageTransHuge(page))
			mtc.gfp_mask |= __GFP_COMP;

		return alloc_migration_target(page, (unsigned long)&mtc);
	}

The only issue I see is that alloc_migration_target seems to "override" the
gfp_mask and ORs in GFP_TRANSHUGE for THP pages, which includes
__GFP_DIRECT_RECLAIM (not wanted in this case). But maybe this can be
worked around by checking whether gfp_mask == GFP_DEMOTION and, if so,
keeping the mask as it is.

> +
> +	if (list_empty(demote_pages))
> +		return 0;
> +
> +	/* Demotion ignores all cpuset and mempolicy settings */
> +	err = migrate_pages(demote_pages, alloc_demote_page, NULL,
> +			    target_nid, MIGRATE_ASYNC, MR_DEMOTION,
> +			    &nr_succeeded);

As I said, instead of alloc_demote_page, use a new_demotion_page and let
alloc_migration_target handle the allocations and prep the THP pages.

-- 
Oscar Salvador
SUSE L3
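
The gfp_mask concern raised above can be modelled in userspace to make the
intent concrete. This is only a sketch under stated assumptions: the bit
values are made up (the real ones live in include/linux/gfp.h),
MODEL_GFP_TRANSHUGE stands in for GFP_TRANSHUGE with just the two bits that
matter here, and thp_target_mask is a hypothetical helper, not a kernel
function. It also assumes the comparison against GFP_DEMOTION is done with
__GFP_COMP masked out, since the caller may already have set it.

```c
#include <assert.h>

typedef unsigned int gfp_t;

/* Hypothetical bit assignments, for illustration only. */
enum {
	BIT_HIGHMEM		= 1u << 0,
	BIT_MOVABLE		= 1u << 1,
	BIT_NORETRY		= 1u << 2,
	BIT_NOMEMALLOC		= 1u << 3,
	BIT_NOWARN		= 1u << 4,
	BIT_THISNODE		= 1u << 5,
	BIT_KSWAPD_RECLAIM	= 1u << 6,
	BIT_DIRECT_RECLAIM	= 1u << 7,
	BIT_COMP		= 1u << 8,
};

/* The flag set from the quoted patch, given a name as suggested. */
#define GFP_DEMOTION	(BIT_HIGHMEM | BIT_MOVABLE | BIT_NORETRY |	\
			 BIT_NOMEMALLOC | BIT_NOWARN | BIT_THISNODE |	\
			 BIT_KSWAPD_RECLAIM)

/* Stand-in for GFP_TRANSHUGE: only the bits relevant to this point. */
#define MODEL_GFP_TRANSHUGE	(BIT_COMP | BIT_DIRECT_RECLAIM)

/* Demotion wants kswapd woken up, but never direct reclaim. */
static_assert((GFP_DEMOTION & BIT_DIRECT_RECLAIM) == 0,
	      "GFP_DEMOTION must not include direct reclaim");

/*
 * Hypothetical helper modelling the mask chosen for a THP target.
 * Assumption: a demotion request is recognised with the compound bit
 * masked out, and keeps its mask instead of gaining direct reclaim.
 */
static gfp_t thp_target_mask(gfp_t requested)
{
	if ((requested & ~(gfp_t)BIT_COMP) == GFP_DEMOTION)
		return requested | BIT_COMP;

	return requested | MODEL_GFP_TRANSHUGE;
}
```

In this model a demotion request comes back with the compound bit set and
without direct reclaim, while any other mask still picks up both bits.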