From: Uladzislau Rezki
Date: Sat, 16 Oct 2021 18:46:24 +0200
To: Nicholas Piggin, Chen Wandun
Cc: Chen Wandun, Shakeel Butt, Andrew Morton, Eric Dumazet, guohanjun@huawei.com, LKML, Linux MM, Kefeng Wang
Subject: Re: [PATCH] mm/vmalloc: fix numa spreading for large hash tables
Message-ID: <20211016164624.GA1932@pc638.lan>
References: <20210928121040.2547407-1-chenwandun@huawei.com> <8fc5e1ae-a356-6225-2e50-cf0e5ee26208@huawei.com> <1634261360.fed2opbgxw.astroid@bobo.none> <1634281763.ecsq6l88ia.astroid@bobo.none>
In-Reply-To: <1634281763.ecsq6l88ia.astroid@bobo.none>

On Fri, Oct 15, 2021 at 05:11:25PM +1000, Nicholas Piggin wrote:
> Excerpts from Chen Wandun's message of October 15, 2021 12:31 pm:
> >
> >
> > On 2021/10/15 9:34, Nicholas Piggin wrote:
> >> Excerpts from Chen Wandun's message of October 14, 2021 6:59 pm:
> >>>
> >>>
> >>> On 2021/10/14 5:46, Shakeel Butt wrote:
> >>>> On Tue, Sep 28, 2021 at 5:03 AM Chen Wandun wrote:
> >>>>>
> >>>>> Eric Dumazet reported strange NUMA spreading info in [1] and found
> >>>>> that commit 121e6f3258fe ("mm/vmalloc: hugepage vmalloc mappings")
> >>>>> introduced this issue [2].
> >>>>>
> >>>>> Digging into the behavior before and after this patch, the page
> >>>>> allocation path differs:
> >>>>>
> >>>>> before:
> >>>>> alloc_large_system_hash
> >>>>>     __vmalloc
> >>>>>         __vmalloc_node(..., NUMA_NO_NODE, ...)
> >>>>>             __vmalloc_node_range
> >>>>>                 __vmalloc_area_node
> >>>>>                     alloc_page /* nid is NUMA_NO_NODE, so the alloc_page branch is chosen */
> >>>>>                         alloc_pages_current
> >>>>>                             alloc_page_interleave /* can be confirmed by printing the policy mode */
> >>>>>
> >>>>> after:
> >>>>> alloc_large_system_hash
> >>>>>     __vmalloc
> >>>>>         __vmalloc_node(..., NUMA_NO_NODE, ...)
> >>>>>             __vmalloc_node_range
> >>>>>                 __vmalloc_area_node
> >>>>>                     alloc_pages_node /* nid is chosen by numa_mem_id() */
> >>>>>                         __alloc_pages_node(nid, ....)
> >>>>>
> >>>>> So after commit 121e6f3258fe ("mm/vmalloc: hugepage vmalloc mappings"),
> >>>>> memory is allocated on the current node instead of being interleaved
> >>>>> across nodes.
> >>>>>
> >>>>> [1]
> >>>>> https://lore.kernel.org/linux-mm/CANn89iL6AAyWhfxdHO+jaT075iOa3XcYn9k6JJc7JR2XYn6k_Q@mail.gmail.com/
> >>>>>
> >>>>> [2]
> >>>>> https://lore.kernel.org/linux-mm/CANn89iLofTR=AK-QOZY87RdUZENCZUT4O6a0hvhu3_EwRMerOg@mail.gmail.com/
> >>>>>
> >>>>> Fixes: 121e6f3258fe ("mm/vmalloc: hugepage vmalloc mappings")
> >>>>> Reported-by: Eric Dumazet
> >>>>> Signed-off-by: Chen Wandun
> >>>>> ---
> >>>>>  mm/vmalloc.c | 33 ++++++++++++++++++++++++++-------
> >>>>>  1 file changed, 26 insertions(+), 7 deletions(-)
> >>>>>
> >>>>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> >>>>> index f884706c5280..48e717626e94 100644
> >>>>> --- a/mm/vmalloc.c
> >>>>> +++ b/mm/vmalloc.c
> >>>>> @@ -2823,6 +2823,8 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
> >>>>>          unsigned int order, unsigned int nr_pages, struct page **pages)
> >>>>>  {
> >>>>>      unsigned int nr_allocated = 0;
> >>>>> +    struct page *page;
> >>>>> +    int i;
> >>>>>
> >>>>>      /*
> >>>>>       * For order-0 pages we make use of bulk allocator, if
> >>>>> @@ -2833,6 +2835,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
> >>>>>      if (!order) {
> >>>>
> >>>> Can you please replace the above with if (!order && nid != NUMA_NO_NODE)?
> >>>>
> >>>>>          while (nr_allocated < nr_pages) {
> >>>>>              unsigned int nr, nr_pages_request;
> >>>>> +            page = NULL;
> >>>>>
> >>>>>              /*
> >>>>>               * A maximum allowed request is hard-coded and is 100
> >>>>> @@ -2842,9 +2845,23 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
> >>>>>               */
> >>>>>              nr_pages_request = min(100U, nr_pages - nr_allocated);
> >>>>>
> >>>>
> >>>> Undo the following change in this if block.
> >>>
> >>> Yes, it seems simpler as you suggested, but it still has a
> >>> performance regression. I plan to change the following to consider
> >>> both mempolicy and alloc_pages_bulk.
> >>
> >> Thanks for finding and debugging this.
> >> These APIs are a maze of twisty
> >> little passages, all alike so I could be as confused as I was when I
> >> wrote that patch, but doesn't a minimal fix look something like this?
> >
> > Yes, I sent a patch; it looks like what you show, and it also
> > contains some performance optimization.
> >
> > [PATCH] mm/vmalloc: introduce alloc_pages_bulk_array_mempolicy to
> > accelerate memory allocation
>
> Okay. It would be better to do it as two patches. First the minimal fix
> so it can be backported easily and have the Fixes: tag pointed at my
> commit. Then the performance optimization.
>
It is not only your commit. It also fixes mine :)

commit 5c1f4e690eecc795b2e4d4408e87302040fceca4
Author: Uladzislau Rezki (Sony)
Date:   Mon Jun 28 19:40:14 2021 -0700

    mm/vmalloc: switch to bulk allocator in __vmalloc_area_node()

I agree there should be two separate patches that fix the NUMA balancing
issue, each tagged with "Fixes". One is located here:

https://lkml.org/lkml/2021/10/15/1172

The second one should fix the other place where "big" pages are
allocated, which is basically your patch.

--
Vlad Rezki
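
For readers following the thread, the effect of the guard suggested above, if (!order && nid != NUMA_NO_NODE), can be sketched as a small user-space model. This is only an illustration under stated assumptions, not kernel code: toy_vm_area_alloc_pages(), toy_interleave_node(), TOY_NR_NODES, TOY_LOCAL_NODE and TOY_BULK_MAX are hypothetical names, the machine is assumed to have 4 nodes, an interleave policy is assumed to be active, and only order-0 pages are modeled.

```c
#include <assert.h>
#include <stdbool.h>

#define NUMA_NO_NODE   (-1)
#define TOY_NR_NODES   4     /* hypothetical machine with 4 NUMA nodes */
#define TOY_LOCAL_NODE 0     /* stands in for numa_mem_id() */
#define TOY_BULK_MAX   100U  /* per-request cap quoted in the patch */

/* Toy round-robin cursor standing in for an interleave mempolicy. */
static unsigned int toy_interleave_cursor;

static int toy_interleave_node(void)
{
    int nid = (int)toy_interleave_cursor;

    toy_interleave_cursor = (toy_interleave_cursor + 1) % TOY_NR_NODES;
    return nid;
}

/*
 * Count where nr_pages order-0 pages land. With guard_bulk set, the bulk
 * path is taken only for an explicit node, mirroring the suggested
 * "if (!order && nid != NUMA_NO_NODE)" check.
 */
static unsigned int toy_vm_area_alloc_pages(int nid, unsigned int nr_pages,
                                            bool guard_bulk,
                                            unsigned int per_node[TOY_NR_NODES])
{
    unsigned int nr_allocated = 0;

    assert(nid == NUMA_NO_NODE || (nid >= 0 && nid < TOY_NR_NODES));

    if (!guard_bulk || nid != NUMA_NO_NODE) {
        /* Bulk path: node-bound; NUMA_NO_NODE falls back to the local node. */
        int target = (nid == NUMA_NO_NODE) ? TOY_LOCAL_NODE : nid;

        while (nr_allocated < nr_pages) {
            unsigned int left = nr_pages - nr_allocated;
            unsigned int req = left < TOY_BULK_MAX ? left : TOY_BULK_MAX;

            per_node[target] += req;
            nr_allocated += req;
        }
        return nr_allocated;
    }

    /* Policy-aware path: one page at a time, spread by the toy interleave. */
    while (nr_allocated < nr_pages) {
        per_node[toy_interleave_node()]++;
        nr_allocated++;
    }
    return nr_allocated;
}
```

In this model, a request for 400 pages with NUMA_NO_NODE and guard_bulk unset puts every page on the local node, while the same request with guard_bulk set spreads the pages evenly over the four nodes. The model also hints at why the thread treats the performance optimization as a separate patch: the policy-aware path here gives up batching entirely.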