From: Mateusz Guzik
Date: Thu, 1 Aug 2024 15:37:26 +0200
Subject: Re: [linus:master] [mm] c0bff412e6: stress-ng.clone.ops_per_sec -2.9% regression
To: David Hildenbrand
Cc: "Yin, Fengwei", kernel test robot, Peter Xu, oe-lkp@lists.linux.dev,
 lkp@intel.com, linux-kernel@vger.kernel.org, Andrew Morton, Huacai Chen,
 Jason Gunthorpe, Matthew Wilcox, Nathan Chancellor, Ryan Roberts,
 WANG Xuerui, linux-mm@kvack.org, ying.huang@intel.com, feng.tang@intel.com
In-Reply-To: <5a67c103-1d9d-440d-8bed-bbfa7d3ecf71@redhat.com>
References: <202407301049.5051dc19-oliver.sang@intel.com>
 <193e302c-4401-4756-a552-9f1e07ecedcf@redhat.com>
 <439265d8-e71e-41db-8a46-55366fdd334e@intel.com>
 <90477952-fde2-41d7-8ff4-2102c45e341d@redhat.com>
 <6uxnuf2gysgabyai2r77xrqegb7t7cc2dlzjz6upwsgwrnfk3x@cjj6on3wqm4x>
 <5a67c103-1d9d-440d-8bed-bbfa7d3ecf71@redhat.com>

On Thu, Aug 1, 2024 at 3:34 PM David Hildenbrand wrote:
>
> On 01.08.24 15:30, Mateusz Guzik wrote:
> > On Thu, Aug 01, 2024 at 08:49:27AM +0200, David Hildenbrand wrote:
> >> Yes indeed. fork() can be extremely sensitive to each added instruction.
> >>
> >> I even pointed out to Peter why I didn't add the PageHuge check in there
> >> originally [1].
> >>
> >> "Well, and I didn't want to have runtime-hugetlb checks in
> >> PageAnonExclusive code called on certainly-not-hugetlb code paths."
> >>
> >>
> >> We now have to do a page_folio(page) and then test for hugetlb.
> >>
> >>      return folio_test_hugetlb(page_folio(page));
> >>
> >> Nowadays, folio_test_hugetlb() will be faster than at c0bff412e6 times, so
> >> maybe at least part of the overhead is gone.
> >>
> >
> > I'll note page_folio expands to a call to _compound_head.
> >
> > While _compound_head is declared as an inline, it ends up being big
> > enough that the compiler decides to emit a real function instead and
> > real func calls are not particularly cheap.
> >
> > I had a brief look with a profiler myself and for single-threaded usage
> > the func is quite high up there, while it manages to get out with the
> > first branch -- that is to say there is definitely performance lost for
> > having a func call instead of an inlined branch.
> >
> > The routine is deinlined because of a call to page_fixed_fake_head,
> > which itself is annotated with always_inline.
> >
> > This is of course patchable with minor shoveling.
> >
> > I did not go for it because stress-ng results were too unstable for me
> > to confidently state win/loss.
> >
> > But should you want to whack the regression, this is what I would look
> > into.
> >
>
> This might improve it, at least for small folios I guess:
>
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index 5769fe6e4950..7796ae116018 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -1086,7 +1086,7 @@ PAGE_TYPE_OPS(Zsmalloc, zsmalloc, zsmalloc)
>   */
>  static inline bool PageHuge(const struct page *page)
>  {
> -	return folio_test_hugetlb(page_folio(page));
> +	return PageCompound(page) && folio_test_hugetlb(page_folio(page));
>  }
>
>  /*
>
>
> We would avoid the function call for small folios.
>

Why not massage _compound_head back to an inlineable form instead? For
all I know you may even register a small win in total.

-- 
Mateusz Guzik
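
For illustration only, here is a minimal userspace sketch of the trade-off discussed above. It is not kernel code: the struct layout, the model_* names, and the slow_calls counter are all hypothetical stand-ins, and the noinline attribute (GCC/Clang) merely imitates a helper that the compiler chose to emit out of line. The sketch shows how putting a cheap inlined test in front of the call lets small pages skip the function call entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; NOT the kernel's struct page. */
struct page {
	bool compound;      /* models PageCompound(): head or tail page */
	bool hugetlb;       /* models the hugetlb marker kept on the head */
	struct page *head;  /* head page for tail pages, NULL otherwise */
};

/* Counts how often the out-of-line helper is actually called. */
static unsigned long slow_calls;

/*
 * Stand-in for a _compound_head() that ended up out of line.
 * noinline forces a real function call, as described in the thread.
 */
__attribute__((noinline))
static const struct page *model_compound_head(const struct page *page)
{
	slow_calls++;
	return page->head ? page->head : page;
}

/* Shape of the check before the diff: every caller pays for the call. */
static inline bool model_page_huge_before(const struct page *page)
{
	return model_compound_head(page)->hugetlb;
}

/* Shape of the check after the diff: small pages exit on an inlined branch. */
static inline bool model_page_huge_after(const struct page *page)
{
	return page->compound && model_compound_head(page)->hugetlb;
}
```

In this model a non-compound page never reaches model_compound_head() through model_page_huge_after(), while head and tail pages still do. Making the helper itself inlineable again, as suggested in the reply, would remove the call on those paths as well; whether either change is a measurable win in the kernel would need to be benchmarked.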