From: Mateusz Guzik
Date: Tue, 22 Aug 2023 16:24:56 +0200
Subject: Re: [PATCH 0/2] execve scalability issues, part 1
To: Jan Kara
Cc: Dennis Zhou, linux-kernel@vger.kernel.org, tj@kernel.org, cl@linux.com, akpm@linux-foundation.org, shakeelb@google.com, linux-mm@kvack.org
In-Reply-To: <20230822095154.7cr5ofogw552z3jk@quack3>
References: <20230821202829.2163744-1-mjguzik@gmail.com> <20230821213951.bx3yyqh7omdvpyae@f> <20230822095154.7cr5ofogw552z3jk@quack3>

On 8/22/23, Jan Kara wrote:
> On Tue 22-08-23 00:29:49, Mateusz Guzik wrote:
>> On 8/21/23, Mateusz Guzik wrote:
>> > True Fix(tm) is a longer story.
>> >
>> > Maybe let's sort out this patchset first, whichever way. :)
>> >
>>
>> So I found the discussion around the original patch with a perf
>> regression report.
>>
>> https://lore.kernel.org/linux-mm/20230608111408.s2minsenlcjow7q3@quack3/
>>
>> The reporter suggests dodging the problem by only allocating per-cpu
>> counters when the process is going multithreaded. Given that there is
>> still plenty of forever single-threaded procs out there I think that
>> does sound like a great plan regardless of what happens with this
>> patchset.
>>
>> Almost all access is already done using dedicated routines, so this
>> should be an afternoon churn to sort out, unless I missed a
>> showstopper. (maybe there is no good place to stuff a flag/whatever
>> other indicator about the state of counters?)
>>
>> That said I'll look into it some time this or next week.
>
> Good, just let me know how it went, I also wanted to start looking into
> this to come up with some concrete patches :). What I had in mind was that
> we could use 'counters == NULL' as an indication that the counter is still
> in 'single counter mode'.
>

In the current state there are only pointers to counters in mm_struct
and there is no storage for them in task_struct. So I don't think
merely null-checking the per-cpu stuff is going to cut it -- where
should the single-threaded counters land?

Bonus problem: threads other than current can modify these counters, and
this needs to be safe against current playing with them at the same time.
(and it would be a shame to require current to use atomics on them)

That said, my initial proposal adds a union:

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 5e74ce4a28cd..ea70f0c08286 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -737,7 +737,11 @@ struct mm_struct {

 		unsigned long saved_auxv[AT_VECTOR_SIZE]; /* for /proc/PID/auxv */

-		struct percpu_counter rss_stat[NR_MM_COUNTERS];
+		union {
+			struct percpu_counter rss_stat[NR_MM_COUNTERS];
+			u64 *rss_stat_single;
+		};
+		bool magic_flag_stuffed_elsewhere;

 		struct linux_binfmt *binfmt;

Then for the single-threaded case an area is allocated for
NR_MM_COUNTERS counters * 2 -- the first set is updated without any
synchronization by the current thread. The second set is only to be
modified by others and is protected with mm->arg_lock. The lock protects
remote access to the union to begin with. The transition to per-CPU
operation sets the magic flag (there is plenty of spare space in
mm_struct, I'll find a good home for it without growing the struct). It
would be a one-way street -- a process which gets a bunch of threads and
goes back to one stays with per-CPU.

Then you get the true value of something by adding both counters.

arg_lock is sparingly used, so remote ops are not expected to contend
with anything. In fact their cost is going to go down since percpu
summation takes a spinlock which also disables interrupts.

Local ops should be about the same in cost as they are right now.

I might have missed some detail in the above description, but I think
the approach is decent.

-- 
Mateusz Guzik
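
For illustration only, the single-threaded mode described above could be
sketched in userspace C roughly as follows. This is not the proposed
kernel patch. Every name here (struct mm_sketch, mm_sketch_init,
rss_add_local, rss_add_remote, rss_read) is hypothetical, a C11
atomic_flag spinlock stands in for mm->arg_lock, the counters are signed
so that one set may go negative while the sum stays correct (the diff
uses u64), and the transition to per-CPU mode is left out entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

#define NR_MM_COUNTERS 4 /* illustrative value, not taken from the patch */

/* Hypothetical userspace stand-in for the relevant bits of mm_struct. */
struct mm_sketch {
	atomic_flag arg_lock;     /* stands in for mm->arg_lock */
	int64_t *rss_stat_single; /* [0, NR): local set, [NR, 2*NR): remote set */
};

static void sketch_lock(struct mm_sketch *mm)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&mm->arg_lock,
						 memory_order_acquire))
		;
}

static void sketch_unlock(struct mm_sketch *mm)
{
	atomic_flag_clear_explicit(&mm->arg_lock, memory_order_release);
}

/* Allocate both counter sets, zeroed. Returns 0 on success, -1 on failure. */
static int mm_sketch_init(struct mm_sketch *mm)
{
	mm->rss_stat_single = calloc(2 * NR_MM_COUNTERS, sizeof(int64_t));
	if (!mm->rss_stat_single)
		return -1;
	atomic_flag_clear(&mm->arg_lock);
	return 0;
}

static void mm_sketch_destroy(struct mm_sketch *mm)
{
	free(mm->rss_stat_single);
	mm->rss_stat_single = NULL;
}

/* Update by the owning thread: first set, no lock and no atomics. */
static void rss_add_local(struct mm_sketch *mm, int member, int64_t delta)
{
	assert(member >= 0 && member < NR_MM_COUNTERS);
	mm->rss_stat_single[member] += delta;
}

/* Update by any other thread: second set, serialized by the lock. */
static void rss_add_remote(struct mm_sketch *mm, int member, int64_t delta)
{
	assert(member >= 0 && member < NR_MM_COUNTERS);
	sketch_lock(mm);
	mm->rss_stat_single[NR_MM_COUNTERS + member] += delta;
	sketch_unlock(mm);
}

/*
 * True value = local set + remote set. A real implementation would need
 * a tear-free read of the local slot when called from another thread;
 * the plain load here is a simplification of this sketch.
 */
static int64_t rss_read(struct mm_sketch *mm, int member)
{
	int64_t sum;

	assert(member >= 0 && member < NR_MM_COUNTERS);
	sketch_lock(mm);
	sum = mm->rss_stat_single[member] +
	      mm->rss_stat_single[NR_MM_COUNTERS + member];
	sketch_unlock(mm);
	return sum;
}
```

In this sketch the owning thread would call rss_add_local(), any other
thread would call rss_add_remote(), and rss_read() returns the sum of
the two sets, which is the "true value" in the description above.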