From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: Kairui Song, linux-mm@kvack.org
Cc: Andrew Morton, Hugh Dickins, Matthew Wilcox, Kemeng Shi, Chris Li,
 Nhat Pham, Baoquan He, Barry Song, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 6/9] mm/shmem, swap: never use swap cache and
 readahead for SWP_SYNCHRONOUS_IO
Date: Mon, 7 Jul 2025 16:05:39 +0800
Message-ID: <2b1c5548-bf66-4d4a-a379-d9c6bf35283c@linux.alibaba.com>
In-Reply-To: <20250704181748.63181-7-ryncsn@gmail.com>
References: <20250704181748.63181-1-ryncsn@gmail.com>
 <20250704181748.63181-7-ryncsn@gmail.com>

On 2025/7/5 02:17, Kairui Song wrote:
> From: Kairui Song
>
> Currently if a THP swapin failed due
> to reasons like partially
> conflicting swap cache or ZSWAP enabled, it will fallback to
> cached swapin.
>
> Right now the swap cache still has a non-trivial overhead, and readahead
> is not helpful for SWP_SYNCHRONOUS_IO devices, so we should always skip
> the readahead and swap cache even if the swapin falls back to order 0.
>
> So handle the fallback logic without falling back to the cached read.
>
> Signed-off-by: Kairui Song
> ---
>  mm/shmem.c | 55 +++++++++++++++++++++++++++++++++++-------------------
>  1 file changed, 36 insertions(+), 19 deletions(-)
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 2ab214e2771c..1fe9a3eb92b1 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -1975,13 +1975,16 @@ static struct folio *shmem_alloc_and_add_folio(struct vm_fault *vmf,
>  	return ERR_PTR(error);
>  }
>  
> -static struct folio *shmem_swap_alloc_folio(struct inode *inode,
> +static struct folio *shmem_swapin_direct(struct inode *inode,
>  		struct vm_area_struct *vma, pgoff_t index,
> -		swp_entry_t entry, int order, gfp_t gfp)
> +		swp_entry_t swap, swp_entry_t index_entry,

IMO, 'swap' and 'index_entry' are confusing: it is easy to be unclear
about their respective roles. I suggest passing only the original swap
value. If the allocation falls back to order 0, the swap value can be
recalculated inside this function, which is more readable and also keeps
the function self-contained.

> +		int order, gfp_t gfp)
>  {
>  	struct shmem_inode_info *info = SHMEM_I(inode);
> +	swp_entry_t entry = index_entry;
>  	int nr_pages = 1 << order;
>  	struct folio *new;
> +	gfp_t alloc_gfp;
>  	void *shadow;
>  
>  	/*
> @@ -1989,6 +1992,7 @@ static struct folio *shmem_swap_alloc_folio(struct inode *inode,
>  	 * limit chance of success with further cpuset and node constraints.
>  	 */
>  	gfp &= ~GFP_CONSTRAINT_MASK;
> +	alloc_gfp = gfp;
>  	if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
>  		if (WARN_ON_ONCE(order))
>  			return ERR_PTR(-EINVAL);
> @@ -2003,19 +2007,22 @@ static struct folio *shmem_swap_alloc_folio(struct inode *inode,
>  		if ((vma && unlikely(userfaultfd_armed(vma))) ||
>  		    !zswap_never_enabled() ||
>  		    non_swapcache_batch(entry, nr_pages) != nr_pages)
> -			return ERR_PTR(-EINVAL);
> +			goto fallback;
>  
> -		gfp = limit_gfp_mask(vma_thp_gfp_mask(vma), gfp);
> +		alloc_gfp = limit_gfp_mask(vma_thp_gfp_mask(vma), gfp);
> +	}
> +retry:
> +	new = shmem_alloc_folio(alloc_gfp, order, info, index);
> +	if (!new) {
> +		new = ERR_PTR(-ENOMEM);
> +		goto fallback;
>  	}
> -
> -	new = shmem_alloc_folio(gfp, order, info, index);
> -	if (!new)
> -		return ERR_PTR(-ENOMEM);
>  
>  	if (mem_cgroup_swapin_charge_folio(new, vma ? vma->vm_mm : NULL,
> -					   gfp, entry)) {
> +					   alloc_gfp, entry)) {
>  		folio_put(new);
> -		return ERR_PTR(-ENOMEM);
> +		new = ERR_PTR(-ENOMEM);
> +		goto fallback;
>  	}
>  
>  	/*
> @@ -2030,7 +2037,9 @@ static struct folio *shmem_swap_alloc_folio(struct inode *inode,
>  	 */
>  	if (swapcache_prepare(entry, nr_pages)) {
>  		folio_put(new);
> -		return ERR_PTR(-EEXIST);
> +		new = ERR_PTR(-EEXIST);
> +		/* Try smaller folio to avoid cache conflict */
> +		goto fallback;
>  	}
>  
>  	__folio_set_locked(new);
> @@ -2044,6 +2053,15 @@ static struct folio *shmem_swap_alloc_folio(struct inode *inode,
>  	folio_add_lru(new);
>  	swap_read_folio(new, NULL);
>  	return new;
> +fallback:
> +	/* Order 0 swapin failed, nothing to fallback to, abort */
> +	if (!order)
> +		return new;
> +	order = 0;
> +	nr_pages = 1;
> +	alloc_gfp = gfp;
> +	entry = swap;
> +	goto retry;
>  }
>  
>  /*
> @@ -2309,25 +2327,24 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
>  		count_vm_event(PGMAJFAULT);
>  		count_memcg_event_mm(fault_mm, PGMAJFAULT);
>  	}
> -

Nit: do not add unnecessary changes.

>  	/* Skip swapcache for synchronous device. */
>  	if (data_race(si->flags & SWP_SYNCHRONOUS_IO)) {
> -		folio = shmem_swap_alloc_folio(inode, vma, index,
> -					       index_entry, order, gfp);
> +		folio = shmem_swapin_direct(inode, vma, index, swap,
> +					    index_entry, order, gfp);
>  		if (!IS_ERR(folio)) {
> -			swap = index_entry;
> +			if (folio_test_large(folio))
> +				swap = index_entry;
>  			skip_swapcache = true;
>  			goto alloced;
>  		}
>  
>  		/*
> -		 * Fallback to swapin order-0 folio unless the swap entry
> -		 * already exists.
> +		 * Direct swapin handled order 0 fallback already,
> +		 * if it failed, abort.
>  		 */
>  		error = PTR_ERR(folio);
>  		folio = NULL;
> -		if (error == -EEXIST)
> -			goto failed;
> +		goto failed;
>  	}
>  	/* Cached swapin with readahead, only supports order 0 */
>  	folio = shmem_swapin_cluster(swap, gfp, info, index);