From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 95DC6C5320E for ; Tue, 20 Aug 2024 08:11:39 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id E41B56B007B; Tue, 20 Aug 2024 04:11:38 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id DF1DE6B0082; Tue, 20 Aug 2024 04:11:38 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id CB9846B0083; Tue, 20 Aug 2024 04:11:38 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id AF42E6B007B for ; Tue, 20 Aug 2024 04:11:38 -0400 (EDT) Received: from smtpin02.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id 0DD7041865 for ; Tue, 20 Aug 2024 08:11:38 +0000 (UTC) X-FDA: 82471904676.02.BBD900B Received: from mail-vk1-f170.google.com (mail-vk1-f170.google.com [209.85.221.170]) by imf21.hostedemail.com (Postfix) with ESMTP id 43AED1C0004 for ; Tue, 20 Aug 2024 08:11:36 +0000 (UTC) Authentication-Results: imf21.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=PVH0ZIan; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf21.hostedemail.com: domain of 21cnbao@gmail.com designates 209.85.221.170 as permitted sender) smtp.mailfrom=21cnbao@gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1724141408; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=d798u9gz+fV/mIDRL/mYw7YhZt+WDnqYTUXk6WZw9ZY=; b=0zMZ+8+ymbXcZ/+SU+9x0okCkC495MIldrN8sZJV/FsEKuiZndMH4Klm6VHqkA0X8HheZW kf38hk68XS28GtjCK2ZAqA58wtGRMpTzHQzEGD8HRF8laJHGWUqQT7Oyl9TQ0CxjvNY0BO kBrY+ygawrpMxCzT1AwELhxbQbQEm6c= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1724141408; a=rsa-sha256; cv=none; b=JghcrsilqYRw25871TdSwI92Kx+/cRMFUQbWoWQyB8rjn1YA53+GkgLRgviigsG+iH/0N8 uYi34K0DnO/2w8VI7/0V1xrzJZ4YKGCKkiSJmnWVQkKoq8CuZk+gmnkW0qc/f9rNVRvD61 itWYhpy/YqzPwjuTeNJptb9aqcJLoFM= ARC-Authentication-Results: i=1; imf21.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=PVH0ZIan; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf21.hostedemail.com: domain of 21cnbao@gmail.com designates 209.85.221.170 as permitted sender) smtp.mailfrom=21cnbao@gmail.com Received: by mail-vk1-f170.google.com with SMTP id 71dfb90a1353d-4fce6fd54ebso77477e0c.1 for ; Tue, 20 Aug 2024 01:11:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1724141495; x=1724746295; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=d798u9gz+fV/mIDRL/mYw7YhZt+WDnqYTUXk6WZw9ZY=; b=PVH0ZIan+rjLe2tf2r7N2m+pWVTo/vOinqvxvDe+jGpOasDg/bry3kY02Wy1DhJWls RmkI4FeJzJglxKozeuNLHhspmqQOV1f7H01/gci09CdIb8XXYcewD/i5Ge3ZG5eR/LPv w2TVRY7N949AXgKDsquyfvcTwZau0S6b7y67ubIjM4h0fURpxX07sWeJh6rdVILo9e8l 5dm/2PMmuE3sVtGei9FSC8Uu2HACi+elw4CKUbbK+ZDjp1BGnNy1m7GW+5yvBCClnNK0 dXisOLWTdYPHbFj3+b0+LP8gqc2x3uBYzCXgYtvZj/Z5WZaf0M8XW0bYjR12tbuzJEvZ vWNw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1724141495; x=1724746295; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=d798u9gz+fV/mIDRL/mYw7YhZt+WDnqYTUXk6WZw9ZY=; b=Bf6byFlx56wFdRAaXiZYD/8KN6jmUCCq+12JZrTTV35taK7w84OkYjz6zAn4q+ETU7 OUE38cut+WUsv2nMcpq8nD8sFYBtdVhwIyZCrX4lStT+pjR82MyGH8nnFKRqDD7dEzpG IjdUWGsbqmQ1b8qJ9KHDALyD9p/K+F4YqoTeo9BeF+auPojrn8yxQZnp599tZZCF66g+ Ng92JZ9qKZJOQEw4EnTm2N43yspVL8NWCgquNX+lNFRXzUvkUiC+S0Lq5izqEf+fwxdp C7sKLcomlKqzerZ8QWzpQK//BWeQze3PESfGccrlZYuX6oUf/na6QSB6LbTmK8AbxI6r MD3w== X-Forwarded-Encrypted: i=1; AJvYcCWIT+sv6HlLlakNmt9D522Z2mujy6WIdqn1jUnezVvurEIkUQH9+vCRI625+q95oUe99qScwbWNQA==@kvack.org X-Gm-Message-State: AOJu0YxXX7c+TP6JkdZGzsG71S1OshCs5aal5ZruTc4PFGJ1sDRalt/k crysejCydjj/4W03iLn/oY218cEFdGm/lcJwKbCYmYPOMB9ernsE1SQGDyOyHmdQIcPQOsO7HgP 8twB6Gws1MAACmNaWAxwSsSrBVNA= X-Google-Smtp-Source: AGHT+IE6c4lVuITUWP2AnCS+3U+vimCfcb/vR/r7OmZl1XoOvvfYaZSnjW4MG7xjOGG/qEfZPhP4klY5s9xwAhVNuT8= X-Received: by 2002:a05:6122:d99:b0:4f5:27ac:ce85 with SMTP id 71dfb90a1353d-4fc6c71d87fmr16089723e0c.3.1724141494985; Tue, 20 Aug 2024 01:11:34 -0700 (PDT) MIME-Version: 1.0 References: <20240817045516.58037-1-21cnbao@gmail.com> In-Reply-To: From: Barry Song <21cnbao@gmail.com> Date: Tue, 20 Aug 2024 20:11:23 +1200 Message-ID: Subject: Re: [PATCH v5] mm: override mTHP "enabled" defaults at kernel cmdline To: David Hildenbrand Cc: akpm@linux-foundation.org, linux-mm@kvack.org, baolin.wang@linux.alibaba.com, corbet@lwn.net, ioworker0@gmail.com, linux-kernel@vger.kernel.org, ryan.roberts@arm.com, v-songbaohua@oppo.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: 43AED1C0004 X-Stat-Signature: 3m94rojkswnapaaeio6otcfz7k8pywep X-Rspam-User: X-HE-Tag: 1724141496-613543 X-HE-Meta: U2FsdGVkX18jSA8xSemzkwrqIEvDgbsmn6Yj4HwXQ6Mo1YT0NzMEX5JzwPxLt04zLqH3v6ves9Yi4GyLseCsH0C+urEmlMre82sKEcqGSS6IlreqmZgefsr9THUuo44MbKcmma00xWL7VvxuvKTHUSfpyuzX/Z7I2pCeSvmscZHgsEbjKDkmVu/uVu9qEGvoYPStXsZ9pCXTJo4DN2iZaNxkFcOWMFllAlL/WSgSCRcN+WY8D85vUwBtdRw4zEaOhUGOjE0wiCa45F1j08fXvA9cBlsgL048Vqbv7IIxjzWDPzXWW6bLrVKM9AHvSFf1+0GPzpiWhbHXobzZIzWOi9L0zVAHwGQ2GRAc8oHf1WOzAqPJDm/B0WHazJ4CeY+cLzmKWr07kFdomcO+cuJNHywPp5KYWYpFGjxyRmT982a+XAMv5dwTHQFAuyBws/S2J4jRHB9DpVez8hdCBw4Xa9o+d71Fh+t/6XnDO40cGIXvmIeAsexJqyXlQV1uTUTX5jw2DISvWHGmaGP04QVYCevFsu+4cOZEnaahR2mbOrmXH4OdCq7QUlnQrPcInaTmUwN92ZBr1fIp2IHejIdaEytpCsfWjPfkJHWkVM1Q0DKas9Buwj4Ivn4I7eQMxJrKPrbHanmH+ILnHE3XywFsTHsHYhonQx5OWdw0k/oR9DdCdNNSTdpsnkQuHZlkGZb7v/Og7WsIRXp8lroagiQdGaMbGLvzv0Hal0pWeILodAayWen5r25X3XisAZ7XrMi58KzixO45x5iMyi/vHs57Pnj7hGluZVswvuCnyMTQRitUxuoHVu5mPmokN/T0RXudpPJrDVYrckMtzsvyU781ARx2HeQTKb4SZQgvVCW2gLerWqaM3QRjvXfrU3icwgyzZ+INArS4cgronG2SQJNpENcrS8qHgegUOmncEkOCIIY0nFwfre2zt7jaajxaslKOMOozK61kekGIpTp1AFA ExrAl7Nw 3rgvH2D5X3Qom8n3toO9fkDQgCx1bEMo4cAgODDhjFl1UV5Us4yKvTIUK6HSlgQy3Be42AVQjhkrtjm3dRfBVe3jS9YS9H0r5nx5mRO71kVRi15DyKojK7fJknyO9MAtlKFgl2pddABFL2S8JBjZWH2o3zqSu6xjznfnMkCvXKjNjzXTzEee2lPDkj096ig6iIugGhaKnS3YnOOcCoNtwZoydI98SQ5jqId9js4dFJsm8gs8ZF0pPQcm80y3Bsju0bS4SBaev/Y6xycaDonyM2cVDvUNXIObqY7gqh6LjGN3nIVGHSq748nn2b4T451UoqVPmEwQQ7N4eAL846W7Sms7sdoYcA9Zv7SFNPETYXAORzd79aUwtKU9c0/XsAQrqDxFKVHuwxrkWsXoxRewHVcDPipPvdgQn6By7uhYwPyXvDLWePeD9s3YrKyA375EXUqQidiuKgqtDWufuEmtafUVudHWg/U+5Qb87TLoGLaPYrXuC37q+4drEV7CXxsWEjXU+hSrQwPe/K1YRjx4OT0pSnyW6GaTYICDG X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Tue, Aug 20, 2024 at 7:53=E2=80=AFPM David Hildenbrand wrote: > > On 17.08.24 06:55, Barry Song wrote: > > From: Ryan Roberts > > > > Add thp_anon=3D cmdline parameter to allow specifying the default enabl= ement > > of each supported anon THP size. The parameter accepts the following > > format and can be provided multiple times to configure each size: > > > > thp_anon=3D,[KMG]:;-[KMG]: > > > > An example: > > > > thp_anon=3D16K-64K:always;128K,512K:inherit;256K:madvise;1M-2M:never > > > > See Documentation/admin-guide/mm/transhuge.rst for more details. > > > > Configuring the defaults at boot time is useful to allow early user > > space to take advantage of mTHP before its been configured through > > sysfs. > > > > Signed-off-by: Ryan Roberts > > Co-developed-by: Barry Song > > Signed-off-by: Barry Song > > Reviewed-by: Baolin Wang > > Tested-by: Baolin Wang > > Cc: David Hildenbrand > > Cc: Jonathan Corbet > > Cc: Lance Yang > > > Acked-by: David Hildenbrand > thanks! > Some nits: > > > --- > > -v5: > > * collect Baolin's reviewed-by and tested-by tags, thanks! > > * use get_order and check size is power 2, David, Baolin; > > * use proper __initdata > > > > .../admin-guide/kernel-parameters.txt | 9 ++ > > Documentation/admin-guide/mm/transhuge.rst | 38 +++++-- > > mm/huge_memory.c | 100 +++++++++++++++++= - > > 3 files changed, 139 insertions(+), 8 deletions(-) > > > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Document= ation/admin-guide/kernel-parameters.txt > > index f0057bac20fb..d0d141d50638 100644 > > --- a/Documentation/admin-guide/kernel-parameters.txt > > +++ b/Documentation/admin-guide/kernel-parameters.txt > > @@ -6629,6 +6629,15 @@ > > : poll all this frequency > > 0: no polling (default) > > > > + thp_anon=3D [KNL] > > + Format: ,[KMG]:;-[= KMG]: > > + state is one of "always", "madvise", "never" or "= inherit". > > + Can be used to control the default behavior of th= e > > s/Can be used to control/Control/ ack > > > + system with respect to anonymous transparent huge= pages. > > + Can be used multiple times for multiple anon THP = sizes. > > + See Documentation/admin-guide/mm/transhuge.rst fo= r more > > + details. > > + > > threadirqs [KNL,EARLY] > > Force threading of all interrupt handlers except = those > > marked explicitly IRQF_NO_THREAD. > > diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation= /admin-guide/mm/transhuge.rst > > index 60522f49178b..4468851b6ecb 100644 > > --- a/Documentation/admin-guide/mm/transhuge.rst > > +++ b/Documentation/admin-guide/mm/transhuge.rst > > @@ -284,13 +284,37 @@ that THP is shared. Exceeding the number would bl= ock the collapse:: > > > > A higher value may increase memory footprint for some workloads. > > [...] > > > unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma, > > unsigned long vm_flags, > > @@ -756,7 +757,10 @@ static int __init hugepage_init_sysfs(struct kobje= ct **hugepage_kobj) > > * disable all other sizes. powerpc's PMD_ORDER isn't a compile-t= ime > > * constant so we have to do this here. > > */ > > - huge_anon_orders_inherit =3D BIT(PMD_ORDER); > > + if (!anon_orders_configured) { > > + huge_anon_orders_inherit =3D BIT(PMD_ORDER); > > + anon_orders_configured =3D true; I realized this is redundant since anon_orders_configured won't be accessed later. so i would like to also drop "anon_orders_configured =3D true" in v6. > > + } > > > > *hugepage_kobj =3D kobject_create_and_add("transparent_hugepage",= mm_kobj); > > if (unlikely(!*hugepage_kobj)) { > > @@ -941,6 +945,100 @@ static int __init setup_transparent_hugepage(char= *str) > > } > > __setup("transparent_hugepage=3D", setup_transparent_hugepage); > > > > +static inline int get_order_from_str(const char *size_str) > > +{ > > + unsigned long size; > > + char *endptr; > > + int order; > > + > > + size =3D memparse(size_str, &endptr); > > + > > + if (!is_power_of_2(size)) > > + goto err; > > + order =3D get_order(size); > > + if ((1 << order) & ~THP_ORDERS_ALL_ANON) > > Could do > > "if (BIT(order) & ~THP_ORDERS_ALL_ANON)" ack > > > + goto err; > > + > > + return order; > > +err: > > + pr_err("invalid size %s in thp_anon boot parameter\n", size_str); > > + return -EINVAL; > > +} > > + > > +static char str_dup[PAGE_SIZE] __initdata; > > +static int __init setup_thp_anon(char *str) > > +{ > > + char *token, *range, *policy, *subtoken; > > + unsigned long always, inherit, madvise; > > + char *start_size, *end_size; > > + int start, end, nr; > > + char *p; > > + > > + if (!str || strlen(str) + 1 > PAGE_SIZE) > > + goto err; > > + strcpy(str_dup, str); > > + > > + always =3D huge_anon_orders_always; > > + madvise =3D huge_anon_orders_madvise; > > + inherit =3D huge_anon_orders_inherit; > > Should we only pickup these values if "anon_orders_configured", > otherwise start with 0? Likely that's implicit right now. My point is that, initially, those values are always 0, so copying them won't cause any issues. Of course, we could add a check like if (anon_orders_configured) copy... but it doesn't seem to be a hot path by any means? in that case, i will have to initialize: unsigned long always =3D 0, inherit =3D 0, madvise =3D 0; > > > + p =3D str_dup; > > + while ((token =3D strsep(&p, ";")) !=3D NULL) { > > + range =3D strsep(&token, ":"); > > + policy =3D token; > > + > > + if (!policy) > > + goto err; > > + > > + while ((subtoken =3D strsep(&range, ",")) !=3D NULL) { > > + if (strchr(subtoken, '-')) { > > + start_size =3D strsep(&subtoken, "-"); > > + end_size =3D subtoken; > > + > > + start =3D get_order_from_str(start_size); > > + end =3D get_order_from_str(end_size); > > + } else { > > + start =3D end =3D get_order_from_str(subt= oken); > > + } > > + > > + if (start < 0 || end < 0 || start > end) > > + goto err; > > + > > + nr =3D end - start + 1; > > This only works as long as we don't have any "Holes", which is the case > right now. Right. If a gap appears in the future, I can verify that all bits from star= t to end are valid against THP_ORDERS_ALL_ANON. > > > + if (!strcmp(policy, "always")) { > > + bitmap_set(&always, start, nr); > > + bitmap_clear(&inherit, start, nr); > > + bitmap_clear(&madvise, start, nr); > > + } else if (!strcmp(policy, "madvise")) { > > + bitmap_set(&madvise, start, nr); > > + bitmap_clear(&inherit, start, nr); > > + bitmap_clear(&always, start, nr); > > + } else if (!strcmp(policy, "inherit")) { > > + bitmap_set(&inherit, start, nr); > > + bitmap_clear(&madvise, start, nr); > > + bitmap_clear(&always, start, nr); > > + } else if (!strcmp(policy, "never")) { > > + bitmap_clear(&inherit, start, nr); > > + bitmap_clear(&madvise, start, nr); > > + bitmap_clear(&always, start, nr); > > + } else { > > + pr_err("invalid policy %s in thp_anon boo= t parameter\n", policy); > > + goto err; > > + } > > + } > > + } > > + > > + huge_anon_orders_always =3D always; > > + huge_anon_orders_madvise =3D madvise; > > + huge_anon_orders_inherit =3D inherit; > > + anon_orders_configured =3D true; > > + return 1; > > + > > +err: > > + pr_warn("thp_anon=3D%s: cannot parse, ignored\n", str); > > "cannot parse, ignored" -> "error parsing string, ignoring setting" ? Ack. > > > + return 0; > > +} > > +__setup("thp_anon=3D", setup_thp_anon); > > + > > pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma) > > { > > if (likely(vma->vm_flags & VM_WRITE)) > > -- > Cheers, > > David / dhildenb > Thanks Barry