Message-ID: <4b47e6317aca3deeabf610a7f4839563ff2b25a1.camel@intel.com>
Subject: Re: [PATCH v2 2/9] mm/vmscan: remove unneeded can_split_huge_page check
From: "ying.huang@intel.com"
To: David Hildenbrand, Miaohe Lin, Oscar Salvador
Cc: akpm@linux-foundation.org, songmuchun@bytedance.com, hch@infradead.org, willy@infradead.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Pavel Tatashin, John Hubbard, Linus Torvalds, Vlastimil Babka, Yu Zhao
Date: Fri, 15 Apr 2022 11:07:53 +0800
References: <20220409093500.10329-1-linmiaohe@huawei.com> <20220409093500.10329-3-linmiaohe@huawei.com> <7455b680-3d89-5d3e-ba0e-6e4358b114a2@huawei.com>

On Wed, 2022-04-13 at 09:26 +0800, ying.huang@intel.com wrote:
> On Tue, 2022-04-12 at 16:59 +0200, David Hildenbrand wrote:
> > On 12.04.22 15:42, Miaohe Lin wrote:
> > > On 2022/4/12 16:59, Oscar Salvador wrote:
> > > > On Sat, Apr 09, 2022 at 05:34:53PM +0800, Miaohe Lin wrote:
> > > > > We don't need to check can_split_folio() because folio_maybe_dma_pinned()
> > > > > is checked before. It will avoid the long term pinned pages to be swapped
> > > > > out. And we can live with short term pinned pages. Without can_split_folio
> > > > > checking we can simplify the code. Also activate_locked can be changed to
> > > > > keep_locked as it's just short term pinning.
> > > >
> > > > What do you mean by "we can live with short term pinned pages"?
> > > > Does it mean that it was not pinned when we check
> > > > folio_maybe_dma_pinned() but now it is?
> > > >
> > > > To me it looks like the pinning is fluctuating and we rely on
> > > > split_folio_to_list() to see whether we succeed or not, and if not
> > > > we give it another spin in the next round?
> > >
> > > Yes. Short term pinned pages is relative to long term pinned pages and these pages won't be
> > > pinned for a noticeable time. So it's expected to split the folio successfully in the next
> > > round as the pinning is really fluctuating. Or am I miss something?
> > >
> >
> > Just so we're on the same page. folio_maybe_dma_pinned() only capture
> > FOLL_PIN, but not FOLL_GET. You can have long-term FOLL_GET right now
> > via vmsplice().
>
> Per my original understanding, folio_maybe_dma_pinned() can be used to
> detect long-term pinned pages. And it seems reasonable to skip the
> long-term pinned pages and try short-term pinned pages during page
> reclaiming. But as you pointed out, vmsplice() doesn't use FOLL_PIN.
> So if vmsplice() is expected to pin pages for long time, and we have no
> way to detect it, then we should keep can_split_folio() in the original
> code.
>
> Copying more people who have worked on long-term pinning for comments.

Checked the discussion in the following thread,

https://lore.kernel.org/lkml/CA+CK2bBffHBxjmb9jmSKacm0fJMinyt3Nhk8Nx6iudcQSj80_w@mail.gmail.com/

It seems that, from a practical point of view, folio_maybe_dma_pinned()
can identify most long-term pinned pages that may block memory
hot-remove or CMA allocation, although, as David pointed out, some pages
may still be GUPed for a long time (e.g. via vmsplice()) even if
!folio_maybe_dma_pinned().

On the other hand, can_split_huge_page() is cheap and THP swapout is
expensive (swap space, disk I/O, and hard to recover from), so it may be
better to keep can_split_huge_page() in shrink_page_list().

Best Regards,
Huang, Ying

>
> > can_split_folio() is more precise then folio_maybe_dma_pinned(), but
> > both are racy as long as the page is still mapped.
> >
> >
>
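
To make the difference between the two checks concrete, here is a minimal
userspace sketch. It is only a model under stated assumptions, not kernel
code: struct model_folio and every model_*() helper are hypothetical names
invented for illustration, the pin check mirrors only the large-folio case
(a separate pin counter), and the split check mirrors the idea that every
reference must be explained by a mapping, a swap cache or page cache
reference, or the caller's own reference.

```c
#include <stdbool.h>

/* Hypothetical userspace model of a folio's counters (not struct folio). */
struct model_folio {
	int refcount;   /* all references: mappings, caches, caller, GUP */
	int mapcount;   /* references held by page table mappings */
	int pincount;   /* references taken with FOLL_PIN only */
	int nr_pages;   /* number of base pages in the folio */
	bool anon;      /* anonymous memory? */
	bool swapcache; /* already added to the swap cache? */
};

/* Assumed baseline: an anonymous 512-page THP, mapped once, with one
 * extra reference held by the reclaim path that isolated it. */
static inline struct model_folio model_isolated_thp(void)
{
	struct model_folio f = {
		.refcount = 2, .mapcount = 1, .pincount = 0,
		.nr_pages = 512, .anon = true, .swapcache = false,
	};
	return f;
}

/* A FOLL_GET-style reference (e.g. vmsplice()) bumps only the refcount. */
static inline void model_gup_get(struct model_folio *f)
{
	f->refcount++;
}

/* A FOLL_PIN-style reference bumps the refcount and the pin counter. */
static inline void model_gup_pin(struct model_folio *f)
{
	f->refcount++;
	f->pincount++;
}

/* Models the pin check for a large folio: it sees FOLL_PIN only. */
static inline bool model_maybe_dma_pinned(const struct model_folio *f)
{
	return f->pincount > 0;
}

/* Models the split check: any reference not explained by a mapping,
 * a cache reference, or the caller's own reference blocks the split. */
static inline bool model_can_split(const struct model_folio *f)
{
	int extra_pins;

	if (f->anon)
		extra_pins = f->swapcache ? f->nr_pages : 0;
	else
		extra_pins = f->nr_pages;

	return f->mapcount == f->refcount - extra_pins - 1;
}
```

In this model, calling model_gup_get() on the baseline folio leaves
model_maybe_dma_pinned() false while model_can_split() turns false, which
is the vmsplice() case David described. Calling model_gup_pin() instead
makes both checks report the extra reference. Neither check says anything
about how long the reference will be held, and both are only snapshots.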