Message-ID: <14e38e95-2bc6-4571-b502-4e3954b4bcc4@linux.dev>
Date: Mon, 22 Jan 2024 18:12:53 +0800
Subject: Re: [PATCH v4 6/7] hugetlb: parallelize 2M hugetlb allocation and initialization
To: Muchun Song, David Hildenbrand, David Rientjes, Mike Kravetz, Andrew Morton, Tim Chen
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, ligang.bdlg@bytedance.com
References: <20240118123911.88833-1-gang.li@linux.dev> <20240118123911.88833-7-gang.li@linux.dev>
From: Gang Li

On 2024/1/22 15:10, Muchun Song wrote:
> On 2024/1/18 20:39, Gang Li wrote:
>> +static void __init hugetlb_alloc_node(unsigned long start, unsigned
>> long end, void *arg)
>>   {
>> -    unsigned long i;
>> +    struct hstate *h = (struct hstate *)arg;
>> +    int i, num = end - start;
>> +    nodemask_t node_alloc_noretry;
>> +    unsigned long flags;
>> +    int next_node = 0;
>
> This should be first_online_node which may be not zero.
>

That's right. Thanks!

>> -    for (i = 0; i < h->max_huge_pages; ++i) {
>> -        if (!alloc_bootmem_huge_page(h, NUMA_NO_NODE))
>> +    /* Bit mask controlling how hard we retry per-node allocations.*/
>> +    nodes_clear(node_alloc_noretry);
>> +
>> +    for (i = 0; i < num; ++i) {
>> +        struct folio *folio = alloc_pool_huge_folio(h,
>> &node_states[N_MEMORY],
>> +                        &node_alloc_noretry, &next_node);
>> +        if (!folio)
>>               break;
>> +        spin_lock_irqsave(&hugetlb_lock, flags);
>
> I suspect there will be more contention on this lock when parallelizing.

In the worst case, only 'numa node number' threads contend for it.
In my testing this does not degrade performance; it actually improves
performance, due to the reduced granularity.

> I want to know why you chose to drop prep_and_add_allocated_folios()
> call in the original hugetlb_pages_alloc_boot()?

It was split up so that hugetlb_vmemmap_optimize_folios can be
parallelized.

>> +static unsigned long __init hugetlb_pages_alloc_boot(struct hstate *h)
>> +{
>> +    struct padata_mt_job job = {
>> +        .fn_arg        = h,
>> +        .align        = 1,
>> +        .numa_aware    = true
>> +    };
>> +
>> +    job.thread_fn    = hugetlb_alloc_node;
>> +    job.start    = 0;
>> +    job.size    = h->max_huge_pages;
>> +    job.min_chunk    = h->max_huge_pages / num_node_state(N_MEMORY) / 2;
>> +    job.max_threads    = num_node_state(N_MEMORY) * 2;
>
> I am curious about the magic number of 2 used in the assignments of
> ->min_chunk and ->max_threads, does it come from your experiment? I think
> there should be a comment here.
>

These values come from testing. I can perform more detailed tests and
provide data.

> And I am also sceptical about the optimization for a small amount of
> allocation of hugepages. Given 4 hugepages needed to be allocated on a UMA
> system, job.min_chunk will be 2, job.max_threads will be 2.
> Then, 2 workers will be scheduled, however each worker will just allocate
> 2 pages, how much is the cost of scheduling? What if we allocate 4 pages in
> a single worker? Do you have any numbers on parallelism vs non-parallelism
> in a small allocation case? If we cannot gain from this case, I think we
> should assign a reasonable value to ->min_chunk based on experiment.
>
> Thanks.
>

That's a good suggestion. I'll run some tests and choose the best values.
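
To make the numbers in the question above concrete, here is a minimal
userspace sketch of the arithmetic in the two quoted assignments. It is
only an illustration: `struct job_params` and `compute_job_params()` are
hypothetical names, not kernel APIs, `nr_nodes` stands in for
num_node_state(N_MEMORY), and nothing here models how padata actually
schedules work.

```c
#include <assert.h>

/* Hypothetical stand-in for the two padata_mt_job fields under discussion. */
struct job_params {
	unsigned long min_chunk;
	int max_threads;
};

/*
 * Mirrors the quoted assignments:
 *   min_chunk   = max_huge_pages / nr_nodes / 2
 *   max_threads = nr_nodes * 2
 * nr_nodes is an assumed substitute for num_node_state(N_MEMORY).
 */
static struct job_params compute_job_params(unsigned long max_huge_pages,
					     int nr_nodes)
{
	struct job_params p;

	/* At least one node with memory is assumed; guards the division. */
	assert(nr_nodes > 0);

	p.min_chunk = max_huge_pages / (unsigned long)nr_nodes / 2;
	p.max_threads = nr_nodes * 2;
	return p;
}
```

With 4 pages on a single node, compute_job_params(4, 1) gives
min_chunk == 2 and max_threads == 2, which matches the UMA case described
in the question. Because this is integer division, a request smaller than
twice the node count (for example 3 pages on 2 nodes) gives
min_chunk == 0 in this sketch.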