From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, Hugh Dickins, Baolin Wang, Matthew Wilcox, Kemeng Shi, Chris Li, Nhat Pham, Baoquan He, Barry Song, linux-kernel@vger.kernel.org, Kairui Song
Subject: [PATCH v4 5/9] mm/shmem, swap: avoid false positive swap cache lookup
Date: Sat, 5 Jul 2025 02:17:44 +0800
Message-ID: <20250704181748.63181-6-ryncsn@gmail.com>
In-Reply-To: <20250704181748.63181-1-ryncsn@gmail.com>
References: <20250704181748.63181-1-ryncsn@gmail.com>
Reply-To: Kairui Song
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

From: Kairui Song

If a shmem read request's index points to the middle of a large swap
entry, shmem swapin will do the swap cache lookup using the large swap
entry's starting value (which is the first sub swap entry of this large
entry). This leads to false positive lookup results if only the first
few swap entries are cached but the actual requested swap entry pointed
to by index is uncached. This is not a rare event, as swap readahead
always tries to cache order 0 folios when possible.

Currently, when this occurs, shmem will split the large entry, abort
due to a mismatching folio swap value, then retry the swapin from the
beginning, which wastes CPU and adds wrong info to the readahead
statistics.

This can be optimized easily by doing the lookup using the right swap
entry value.

Signed-off-by: Kairui Song
---
 mm/shmem.c | 31 +++++++++++++++----------------
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 217264315842..2ab214e2771c 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2274,14 +2274,15 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 	pgoff_t offset;
 
 	VM_BUG_ON(!*foliop || !xa_is_value(*foliop));
-	swap = index_entry = radix_to_swp_entry(*foliop);
+	index_entry = radix_to_swp_entry(*foliop);
+	swap = index_entry;
 	*foliop = NULL;
 
-	if (is_poisoned_swp_entry(swap))
+	if (is_poisoned_swp_entry(index_entry))
 		return -EIO;
 
-	si = get_swap_device(swap);
-	order = shmem_confirm_swap(mapping, index, swap);
+	si = get_swap_device(index_entry);
+	order = shmem_confirm_swap(mapping, index, index_entry);
 	if (unlikely(!si)) {
 		if (order < 0)
 			return -EEXIST;
@@ -2293,6 +2294,12 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 		return -EEXIST;
 	}
 
+	/* index may point to the middle of a large entry, get the sub entry */
+	if (order) {
+		offset = index - round_down(index, 1 << order);
+		swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
+	}
+
 	/* Look it up and read it in.. */
 	folio = swap_cache_get_folio(swap, NULL, 0);
 	if (!folio) {
@@ -2305,8 +2312,10 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 
 		/* Skip swapcache for synchronous device. */
 		if (data_race(si->flags & SWP_SYNCHRONOUS_IO)) {
-			folio = shmem_swap_alloc_folio(inode, vma, index, swap, order, gfp);
+			folio = shmem_swap_alloc_folio(inode, vma, index,
+						       index_entry, order, gfp);
 			if (!IS_ERR(folio)) {
+				swap = index_entry;
 				skip_swapcache = true;
 				goto alloced;
 			}
@@ -2320,17 +2329,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 			if (error == -EEXIST)
 				goto failed;
 		}
-
-		/*
-		 * Now swap device can only swap in order 0 folio, it is
-		 * necessary to recalculate the new swap entry based on
-		 * the offset, as the swapin index might be unalgined.
-		 */
-		if (order) {
-			offset = index - round_down(index, 1 << order);
-			swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
-		}
-
+		/* Cached swapin with readahead, only supports order 0 */
 		folio = shmem_swapin_cluster(swap, gfp, info, index);
 		if (!folio) {
 			error = -ENOMEM;
-- 
2.50.0
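
For readers who want to check the sub entry arithmetic outside the kernel, here is a minimal userspace sketch of the calculation the patch moves ahead of the swap cache lookup. It is only an illustration, not kernel code: struct demo_swp_entry, demo_round_down() and demo_sub_entry() are hypothetical stand-ins for swp_entry_t, round_down() and the swp_entry()/swp_type()/swp_offset() sequence, and the sketch assumes the large entry is naturally aligned to 1 << order in the mapping.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical userspace stand-in for the kernel's swp_entry_t. */
struct demo_swp_entry {
	unsigned int type;	/* which swap device */
	unsigned long offset;	/* slot within that device */
};

/* Round x down to a multiple of align; align must be a power of two. */
static unsigned long demo_round_down(unsigned long x, unsigned long align)
{
	return x & ~(align - 1);
}

/*
 * Given the first sub entry of a large swap entry (head), the page
 * index being read, and the entry's order, return the sub entry that
 * actually backs index. For order 0 the head is returned unchanged.
 */
static struct demo_swp_entry demo_sub_entry(struct demo_swp_entry head,
					    unsigned long index, int order)
{
	struct demo_swp_entry sub = head;

	assert(order >= 0 && order < (int)(sizeof(unsigned long) * CHAR_BIT));

	if (order > 0)
		sub.offset += index - demo_round_down(index, 1UL << order);

	return sub;
}
```

As a worked case under those assumptions: with an order 4 entry (16 pages) whose head is at swap offset 0x1000, a read at index 0x46 rounds down to 0x40, giving a sub offset of 6, so the lookup should use swap offset 0x1006 rather than 0x1000.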