Date: Thu, 30 May 2024 09:50:26 +0900
From: Byungchul Park
To: Dave Hansen
Cc: "Huang, Ying", linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	kernel_team@skhynix.com, akpm@linux-foundation.org, vernhao@tencent.com,
	mgorman@techsingularity.net, hughd@google.com, willy@infradead.org,
	david@redhat.com, peterz@infradead.org, luto@kernel.org,
	tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, rjgolo@gmail.com
Subject: Re: [PATCH v10 00/12] LUF(Lazy Unmap Flush) reducing tlb numbers over 90%
Message-ID: <20240530005026.GA47476@system.software.com>
References: <20240510065206.76078-1-byungchul@sk.com>
	<982317c0-7faa-45f0-82a1-29978c3c9f4d@intel.com>
	<20240527015732.GA61604@system.software.com>
	<8734q46jc8.fsf@yhuang6-desk2.ccr.corp.intel.com>
	<44e4f2fd-e76e-445d-b618-17a6ec692812@intel.com>
	<20240529050046.GB20307@system.software.com>
	<961f9533-1e0c-416c-b6b0-d46b97127de2@intel.com>
In-Reply-To: <961f9533-1e0c-416c-b6b0-d46b97127de2@intel.com>

qpPNY9OnSewe786dY/c4MeM3i8e8k4Ee7/ddZfPY+svOo3HqNTaPz5vkAviiuGxSUnMyy1KL 9O0SuDLWbhEs6BKomH5zH3sD4yqeLkZODgkBE4kTZ5azwdg3915kBbFZBFQlGtbtYAGx2QTU JW7c+MkMYosA2adWLmfvYuTiYBboZ5b4/w7E4eQQFgiRmPZhDROIzStgIbHg22ImkCIhgQdM ErO/PmSESAhKnJz5BGwqs4CWxI1/L4GKOIBsaYnl/zhAwpwCthInFp8CKxcVUJY4sO04E8Rx m9glPjwUh7AlJQ6uuMEygVFgFpKps5BMnYUwdQEj8ypGocy8stzEzBwTvYzKvMwKveT83E2M wIhcVvsnegfjpwvBhxgFOBiVeHgPSISnCbEmlhVX5h5ilOBgVhLhPTMpNE2INyWxsiq1KD++ qDQntfgQozQHi5I4r9G38hQhgfTEktTs1NSC1CKYLBMHp1QD45RzflkfSlbcZcq9n71Etn6L z53sf7wCoSy7pnMJP+ThOf3719NvR5h/6mTfZmz+M+/8PXXVuzldS+QZQ7t/Tb8zrbKbU+dg 9rU3L7JzmyOuf5KcnTXNZ+MGE02tlVGnW5ZyL5zZkbpWQbbcdrMx9+5ptgenlgV6Lmu+XPk9 cN2lM06dDxYuklViKc5INNRiLipOBABnIQgrxAIAAA== X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFprJIsWRmVeSWpSXmKPExsXC5WfdrBt+NjzNoPGqisWc9WvYLD5v+Mdm 8enlA0aLFxvaGS2+rv/FbPH0Ux+LxeG5J1ktLu+aw2Zxb81/Vovzu9ayWuxYuo/J4tKBBUwW x3sPMFnMv/eZzWLzpqnMFsenTGW0+P0DqPjkrMksDkIe31v7WDx2zrrL7rFgU6nH5hVaHov3 vGTy2LSqk81j06dJ7B7vzp1j9zgx4zeLx7yTgR7v911l81j84gOTx9Zfdh6NU6+xeXzeJBfA H8Vlk5Kak1mWWqRvl8CVsXaLYEGXQMX0m/vYGxhX8XQxcnJICJhI3Nx7kRXEZhFQlWhYt4MF xGYTUJe4ceMnM4gtAmSfWrmcvYuRi4NZoJ9Z4v87EIeTQ1ggRGLahzVMIDavgIXEgm+LmUCK hAQeMEnM/vqQESIhKHFy5hOwqcwCWhI3/r0EKuIAsqUllv/jAAlzCthKnFh8CqxcVEBZ4sC2 40wTGHlnIemehaR7FkL3AkbmVYwimXlluYmZOaZ6xdkZlXmZFXrJ+bmbGIERtqz2z8QdjF8u ux9iFOBgVOLhPSARnibEmlhWXJl7iFGCg1lJhPfMpNA0Id6UxMqq1KL8+KLSnNTiQ4zSHCxK 4rxe4akJQgLpiSWp2ampBalFMFkmDk6pBsY6kekStporl8bt+6xrMsN0ww/jFWscrCv7J3p9 njlZzvvO9Xze3qqOXsk9NVXzs5NyIv3mGS1ra/vm9DRHe95C2aNpfHbLlLU93P1cJ3j+Cwue yZq4o3yTU/vOmi8TT+abz3M5LaHIqmlxVPax9e0t4Z+NVHktVwh/Db5YK742km+J4L8VYUos xRmJhlrMRcWJAD8c/HisAgAA X-CFilter-Loop: Reflected X-Stat-Signature: sx4trsua1mjduccxkgjunwo3wcp1aipa X-Rspamd-Queue-Id: 635F5C0009 X-Rspam-User: X-Rspamd-Server: rspam01 X-HE-Tag: 1717030234-60286 X-HE-Meta: 
On Wed, May 29, 2024 at 09:41:22AM -0700, Dave Hansen wrote:
> On 5/28/24 22:00, Byungchul Park wrote:
> > All the code updating ptes already performs TLB flush needed in a safe
> > way if it's inevitable e.g. munmap.  LUF which controls when to flush in
> > a higher level than arch code, just leaves stale ro tlb entries that are
> > currently supposed to be in use.  Could you give a scenario that you are
> > concerned about?
>
> Let's go back this scenario:
>
> 	fd = open("/some/file", O_RDONLY);
> 	ptr1 = mmap(-1, size, PROT_READ, ..., fd, ...);
> 	foo1 = *ptr1;
>
> There's a read-only PTE at 'ptr1'.  Right?  The page being pointed to is
> eligible for LUF via the try_to_unmap() paths.  In other words, the page
> might be reclaimed at any time.  If it is reclaimed, the PTE will be
> cleared.
>
> Then, the user might do:
>
> 	munmap(ptr1, PAGE_SIZE);
>
> Which will _eventually_ wind up in the zap_pte_range() loop.  But that
> loop will only see pte_none().  It doesn't do _anything_ to the 'struct
> mmu_gather'.
>
> The munmap() then lands in tlb_flush_mmu_tlbonly() where it looks at the
> 'struct mmu_gather':
>
> 	if (!(tlb->freed_tables || tlb->cleared_ptes ||
> 	      tlb->cleared_pmds || tlb->cleared_puds ||
> 	      tlb->cleared_p4ds))
> 		return;
>
> But since there were no cleared PTEs (or anything else) during the
> unmap, this just returns and doesn't flush the TLB.
>
> We now have an address space with a stale TLB entry at 'ptr1' and not
> even a VMA there.  There's nothing to stop a new VMA from going in,
> installing a *new* PTE, but getting data from the stale TLB entry that
> still hasn't been flushed.

Thank you for the explanation.  I see the problem now.

I think I could handle this case with a new flag in the vma (or something
similar) indicating that LUF has deferred a necessary TLB flush for it
during unmapping, so that the mmu_gather mechanism can be aware of it.  Of
course, the performance impact would need to be checked again.

Thoughts?

Thanks again.

	Byungchul
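
For illustration only, the scenario above and the flag idea can be modeled
in a few lines of user-space C.  This is a toy sketch, not kernel code:
'struct toy_gather', 'struct toy_vma', 'luf_pending', 'luf_deferred' and
all the toy_*() helpers are hypothetical names made up here, and the real
change may well look different.  It only shows that the early return skips
the flush when the PTE is already none, unless something records that LUF
still owes a flush.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a few of the 'struct mmu_gather' bits quoted above. */
struct toy_gather {
	bool freed_tables;
	bool cleared_ptes;
	bool luf_deferred;	/* hypothetical: a LUF-deferred flush is owed */
};

/* Toy stand-in for one mapped page inside a vma. */
struct toy_vma {
	bool pte_present;	/* is a PTE installed? */
	bool tlb_cached;	/* does a CPU still hold a TLB entry? */
	bool luf_pending;	/* hypothetical flag: LUF skipped a flush here */
};

/* Reclaim under LUF: clear the PTE, skip the flush, remember it is owed. */
static void toy_luf_reclaim(struct toy_vma *vma)
{
	assert(vma != NULL);
	if (vma->pte_present) {
		vma->pte_present = false;
		vma->luf_pending = vma->tlb_cached;
	}
}

/* Models the zap loop: only a present PTE marks the gather by itself. */
static void toy_zap(struct toy_gather *tlb, struct toy_vma *vma, bool honor_luf)
{
	if (vma->pte_present) {
		vma->pte_present = false;
		tlb->cleared_ptes = true;
	} else if (honor_luf && vma->luf_pending) {
		tlb->luf_deferred = true;	/* the proposed extra signal */
	}
}

/* Models the quoted early return; returns true if a flush happened. */
static bool toy_flush_tlbonly(struct toy_gather *tlb, struct toy_vma *vma)
{
	if (!(tlb->freed_tables || tlb->cleared_ptes || tlb->luf_deferred))
		return false;
	vma->tlb_cached = false;
	vma->luf_pending = false;
	return true;
}

/* Runs read -> reclaim -> munmap; reports whether a stale entry survives. */
static bool toy_stale_after_munmap(bool honor_luf)
{
	struct toy_vma vma = { .pte_present = true, .tlb_cached = true };
	struct toy_gather tlb = { 0 };

	toy_luf_reclaim(&vma);
	toy_zap(&tlb, &vma, honor_luf);
	toy_flush_tlbonly(&tlb, &vma);
	return vma.tlb_cached;
}
```

Calling toy_stale_after_munmap(false) models the current behavior, where
the stale entry survives the munmap(); toy_stale_after_munmap(true) models
the flag-based idea, where the gather is told about the owed flush.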