From: Jeff Xu
Date: Mon, 5 Aug 2024 11:10:52 -0700
Subject: Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
To: Pedro Falcato
Cc: Linus Torvalds, kernel test robot, Jeff Xu, oe-lkp@lists.linux.dev,
 lkp@intel.com, linux-kernel@vger.kernel.org, Andrew Morton, Kees Cook,
 "Liam R. Howlett", Dave Hansen, Greg Kroah-Hartman, Guenter Roeck,
 Jann Horn, Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox,
 Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan,
 Amer Al Shanawany, Javier Carrasco, Shuah Khan,
 linux-api@vger.kernel.org, linux-mm@kvack.org, ying.huang@intel.com,
 feng.tang@intel.com, fengwei.yin@intel.com
References: <202408041602.caa0372-oliver.sang@intel.com>

On Mon, Aug 5, 2024 at 6:33 AM Pedro Falcato wrote:
>
> On Sun, Aug 4, 2024 at 9:33 PM Linus Torvalds
> wrote:
> >
> > On Sun, 4 Aug 2024 at 01:59, kernel test robot wrote:
> > >
> > > kernel test robot noticed a -4.4% regression of stress-ng.pagemove.page_remaps_per_sec on
> > > commit 8be7258aad44 ("mseal: add mseal syscall")
> >
> > Ok, it's basically just the vma walk in can_modify_mm():
> >
> > >       1.06            +0.1        1.18        perf-profile.self.cycles-pp.mas_next_slot
> > >       1.50            +0.5        1.97        perf-profile.self.cycles-pp.mas_find
> > >       0.00            +1.4        1.35        perf-profile.self.cycles-pp.can_modify_mm
> > >       3.13            +2.0        5.13        perf-profile.self.cycles-pp.mas_walk
> >
> > and looks like it's two different pathways. We have __do_sys_mremap ->
> > mremap_to -> do_munmap -> do_vmi_munmap -> can_modify_mm for the
> > destination mapping, but we also have mremap_to() calling
> > can_modify_mm() directly for the source mapping.
> >
> > And then do_vmi_munmap() will do it's *own* vma_find() after having
> > done arch_unmap().
> >
> > And do_munmap() will obviously do its own vma lookup as part of
> > calling vma_to_resize().
> >
> > So it looks like a large portion of this regression is because the
> > mseal addition just ends up walking the vma list way too much.
>
> Can we rollback the upfront checks "funny business" and just call
> can_modify_vma directly in relevant places?
> I still don't believe in
> the partial mprotect/munmap "security risks" that were stated in the
> mseal thread (and these operations can already fail for many other
> reasons than mseal) :)
>
Both the in-place check and the extra up-front loop, implemented
properly, will prevent changes to sealed memory. However, the extra
loop makes it harder for an attacker to call munmap(0, random large
size), because if any VMA in the range is sealed, the whole operation
becomes a no-op.

> I don't mind taking a look myself, just want to make sure I'm not
> stepping on anyone's toes here.
>
One thing you can't work around is that can_modify_mm() must be called
before arch_unmap(), which means an in-place check for munmap is not
possible. (There is a recent patch/refactor by Liam R. Howlett in this
area, but I am not sure whether it removes this restriction.)

> --
> Pedro
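
To make the difference between the two checking strategies concrete, here is a toy userspace sketch. It is not kernel code: struct toy_vma, toy_munmap_upfront(), toy_munmap_inplace() and toy_count_mapped() are hypothetical names invented for this illustration, and the model assumes a whole VMA is unmapped whenever it overlaps the range (no splitting).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a VMA; NOT the kernel's struct vm_area_struct. */
struct toy_vma {
    unsigned long start;   /* inclusive */
    unsigned long end;     /* exclusive */
    bool sealed;           /* set by a hypothetical mseal() */
    bool mapped;           /* cleared when the VMA is unmapped */
};

/* True if a still-mapped VMA intersects [start, end). */
static bool toy_overlaps(const struct toy_vma *v,
                         unsigned long start, unsigned long end)
{
    return v->mapped && v->start < end && start < v->end;
}

/* Up-front check: one extra pass over the range before anything changes. */
int toy_munmap_upfront(struct toy_vma *vmas, size_t n,
                       unsigned long start, unsigned long end)
{
    assert(vmas != NULL);

    for (size_t i = 0; i < n; i++)
        if (toy_overlaps(&vmas[i], start, end) && vmas[i].sealed)
            return -1;     /* refuse; no VMA has been modified */

    for (size_t i = 0; i < n; i++)
        if (toy_overlaps(&vmas[i], start, end))
            vmas[i].mapped = false;
    return 0;
}

/* In-place check: each VMA is tested only when the walk reaches it. */
int toy_munmap_inplace(struct toy_vma *vmas, size_t n,
                       unsigned long start, unsigned long end)
{
    assert(vmas != NULL);

    for (size_t i = 0; i < n; i++) {
        if (!toy_overlaps(&vmas[i], start, end))
            continue;
        if (vmas[i].sealed)
            return -1;     /* fails, but earlier VMAs are already gone */
        vmas[i].mapped = false;
    }
    return 0;
}

/* Number of VMAs that are still mapped. */
size_t toy_count_mapped(const struct toy_vma *vmas, size_t n)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++)
        if (vmas[i].mapped)
            count++;
    return count;
}
```

In this model the sealed VMA survives either way. The two variants differ only in whether the unsealed VMAs that precede it in the range are lost when the call fails, and the up-front variant pays for that with a second walk over the range.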