Date: Thu, 30 May 2024 09:59:58 +0900
From: Byungchul Park
To: Dave Hansen
Cc: "Huang, Ying", linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	kernel_team@skhynix.com, akpm@linux-foundation.org, vernhao@tencent.com,
	mgorman@techsingularity.net, hughd@google.com, willy@infradead.org,
	david@redhat.com, peterz@infradead.org, luto@kernel.org,
	tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, rjgolo@gmail.com
Subject: Re: [PATCH v10 00/12] LUF(Lazy Unmap Flush) reducing tlb numbers over 90%
Message-ID: <20240530005958.GB47476@system.software.com>
References: <20240510065206.76078-1-byungchul@sk.com>
	<982317c0-7faa-45f0-82a1-29978c3c9f4d@intel.com>
	<20240527015732.GA61604@system.software.com>
	<8734q46jc8.fsf@yhuang6-desk2.ccr.corp.intel.com>
	<44e4f2fd-e76e-445d-b618-17a6ec692812@intel.com>
	<20240529050046.GB20307@system.software.com>
	<961f9533-1e0c-416c-b6b0-d46b97127de2@intel.com>
	<20240530005026.GA47476@system.software.com>
In-Reply-To: <20240530005026.GA47476@system.software.com>

On Thu, May 30, 2024 at 09:50:26AM +0900, Byungchul Park wrote:
> On Wed, May 29, 2024 at 09:41:22AM -0700, Dave Hansen wrote:
> > On 5/28/24 22:00, Byungchul Park wrote:
> > > All the code updating ptes already performs TLB flush needed in a safe
> > > way if it's inevitable e.g. munmap. LUF which controls when to flush in
> > > a higher level than arch code, just leaves stale ro tlb entries that are
> > > currently supposed to be in use. Could you give a scenario that you are
> > > concerned about?
> >
> > Let's go back to this scenario:
> >
> > 	fd = open("/some/file", O_RDONLY);
> > 	ptr1 = mmap(-1, size, PROT_READ, ..., fd, ...);
> > 	foo1 = *ptr1;
> >
> > There's a read-only PTE at 'ptr1'. Right? The page being pointed to is
> > eligible for LUF via the try_to_unmap() paths. In other words, the page
> > might be reclaimed at any time.
> > If it is reclaimed, the PTE will be cleared.
> >
> > Then, the user might do:
> >
> > 	munmap(ptr1, PAGE_SIZE);
> >
> > Which will _eventually_ wind up in the zap_pte_range() loop. But that
> > loop will only see pte_none(). It doesn't do _anything_ to the 'struct
> > mmu_gather'.
> >
> > The munmap() then lands in tlb_flush_mmu_tlbonly() where it looks at the
> > 'struct mmu_gather':
> >
> > 	if (!(tlb->freed_tables || tlb->cleared_ptes ||
> > 	      tlb->cleared_pmds || tlb->cleared_puds ||
> > 	      tlb->cleared_p4ds))
> > 		return;
> >
> > But since there were no cleared PTEs (or anything else) during the
> > unmap, this just returns and doesn't flush the TLB.
> >
> > We now have an address space with a stale TLB entry at 'ptr1' and not
> > even a VMA there. There's nothing to stop a new VMA from going in,
> > installing a *new* PTE, but getting data from the stale TLB entry that
> > still hasn't been flushed.
>
> Thank you for the explanation. I got you. I think I could handle the
> case through a new flag in vma or something indicating LUF has deferred
> necessary TLB flush for it during unmapping so that mmu_gather mechanism
> can be aware of it. Of course, the performance change should be checked
> again. Thoughts?

I will look further into the existing TLB flush optimizations at the arch
level and suggest a better way.

	Byungchul

> Thanks again.
>
> 	Byungchul
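
To make the scenario above concrete, here is a rough user-space toy model
of it. It is not kernel code and not the LUF patch itself. The struct
names, the 'luf_pending' marker and the 'luf_deferred' gather bit are all
made up for illustration, standing in for the "new flag in vma or
something" idea. It only shows why the early return skips the flush when
the zap loop sees pte_none(), and how an extra bit could change that.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins only; these are NOT the kernel's real definitions. */
struct toy_mmu_gather {
	bool freed_tables;
	bool cleared_ptes;
	bool luf_deferred;	/* hypothetical: LUF skipped a flush in this range */
};

struct toy_mm {
	bool pte_present;	/* is there a PTE at ptr1? */
	bool tlb_cached;	/* does a TLB still hold ptr1's translation? */
	bool luf_pending;	/* hypothetical "flush was deferred" marker */
};

/* Reclaim with LUF: clear the PTE but defer the TLB flush. */
static void toy_reclaim_luf(struct toy_mm *mm)
{
	mm->pte_present = false;
	mm->luf_pending = true;
	/* tlb_cached is deliberately left untouched */
}

/* Models the zap loop: only a present PTE is recorded in the gather. */
static void toy_zap_pte_range(struct toy_mm *mm, struct toy_mmu_gather *tlb,
			      bool luf_aware)
{
	if (mm->pte_present) {
		mm->pte_present = false;
		tlb->cleared_ptes = true;
	}
	/* Hypothetical extension: tell the gather about the deferred flush. */
	if (luf_aware && mm->luf_pending)
		tlb->luf_deferred = true;
}

/* Models the early return quoted above, plus the hypothetical extra bit. */
static void toy_tlb_flush(struct toy_mm *mm, struct toy_mmu_gather *tlb)
{
	if (!(tlb->freed_tables || tlb->cleared_ptes || tlb->luf_deferred))
		return;
	mm->tlb_cached = false;
	mm->luf_pending = false;
}

/* Returns true if a stale TLB entry survives the modelled munmap(). */
bool stale_entry_survives_munmap(bool luf_aware)
{
	struct toy_mm mm = { .pte_present = true, .tlb_cached = true };
	struct toy_mmu_gather tlb = { 0 };

	toy_reclaim_luf(&mm);
	assert(mm.tlb_cached);	/* premise: the flush really was deferred */

	toy_zap_pte_range(&mm, &tlb, luf_aware);
	toy_tlb_flush(&mm, &tlb);
	return mm.tlb_cached;
}
```

In this model stale_entry_survives_munmap(false) follows the path Dave
describes and leaves the stale entry in place, while passing true makes
the gather see the deferred flush and drop it. Whether a real
implementation can do this without losing the performance benefit is
exactly what would need to be measured.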