From: Jeff Xu
Date: Mon, 5 Aug 2024 10:54:44 -0700
Subject: Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
To: Linus Torvalds
Cc: kernel test robot, oe-lkp@lists.linux.dev, lkp@intel.com,
    linux-kernel@vger.kernel.org, Andrew Morton, Kees Cook,
    "Liam R. Howlett", Pedro Falcato, Dave Hansen, Greg Kroah-Hartman,
    Guenter Roeck, Jann Horn, Jeff Xu, Jonathan Corbet,
    Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum,
    Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany,
    Javier Carrasco, Shuah Khan, linux-api@vger.kernel.org,
    linux-mm@kvack.org, ying.huang@intel.com, feng.tang@intel.com,
    fengwei.yin@intel.com
References: <202408041602.caa0372-oliver.sang@intel.com>

On Sun, Aug 4, 2024 at 1:33 PM Linus Torvalds wrote:
>
> On Sun, 4 Aug 2024 at 01:59, kernel test robot wrote:
> >
> > kernel test robot noticed a -4.4% regression of stress-ng.pagemove.page_remaps_per_sec on
> > commit 8be7258aad44 ("mseal: add mseal syscall")
>
> Ok, it's basically just the vma walk in can_modify_mm():
>
> >       1.06            +0.1        1.18        perf-profile.self.cycles-pp.mas_next_slot
> >       1.50            +0.5        1.97        perf-profile.self.cycles-pp.mas_find
> >       0.00            +1.4        1.35        perf-profile.self.cycles-pp.can_modify_mm
> >       3.13            +2.0        5.13        perf-profile.self.cycles-pp.mas_walk
>
> and looks like it's two different pathways. We have __do_sys_mremap ->
> mremap_to -> do_munmap -> do_vmi_munmap -> can_modify_mm for the
> destination mapping, but we also have mremap_to() calling
> can_modify_mm() directly for the source mapping.
>

There are two scenarios in the mremap syscall:
1> mremap_to (relocate vma)
2> shrink/expand
These two scenarios are handled by different code paths.

For case 1> mremap_to (relocate vma)
-> can_modify_mm, check src for sealing.
-> if MREMAP_FIXED
->-> do_munmap (dst) // free dst
->->-> do_vmi_munmap (dst)
->->->-> can_modify_mm (dst) // check dst for sealing
-> if dst size is smaller (shrink case)
->-> do_munmap(dst, to remove extra size)
->->-> do_vmi_munmap
->->->-> can_modify_mm(dst) (potentially a duplicate of the check done
for MREMAP_FIXED; in practice the memory should already be unmapped, so
the cost is that of looking up a nonexistent memory range in the maple
tree)

For case 2> Shrink/Expand.
-> can_modify_mm, check whether addr is sealed
-> if dst size is smaller (shrink case)
->-> do_vmi_munmap(remove_extra_size)
->->-> can_modify_mm(addr) (This is redundant because addr is already
checked)

For case 2, we could potentially improve it by passing a flag into
do_vmi_munmap() to indicate that sealing has already been checked by
the caller. (However, this idea has to be tested to show an actual
gain.)

The reported regression is in mremap. I wonder why mprotect/munmap
don't show a similar impact, since they use the same pattern (one extra
out-of-place check for the memory range).

During version 9, I tested munmap/mprotect/madvise for perf [1]. The
test shows mseal adds 20-40 ns or 50-100 CPU cycles per call; this is
much smaller (one tenth) than the change from 5.10 to 6.8. The test
uses multiple VMAs of various types [2].

The next step for me is to run the
stress-ng.pagemove.page_remaps_per_sec test to understand why mremap
shows such a large regression.

[1] https://lore.kernel.org/all/20240214151130.616240-1-jeffxu@chromium.org/
[2] https://github.com/peaktocreek/mmperf

Best regards,
-Jeff

> And then do_vmi_munmap() will do it's *own* vma_find() after having
> done arch_unmap().
>
> And do_munmap() will obviously do its own vma lookup as part of
> calling vma_to_resize().
>
> So it looks like a large portion of this regression is because the
> mseal addition just ends up walking the vma list way too much.
>
>               Linus