From: Lance Yang
To: usama.arif@linux.dev
Cc: akpm@linux-foundation.org, david@kernel.org, chrisl@kernel.org,
	kasong@tencent.com, ljs@kernel.org, ziy@nvidia.com,
	ying.huang@linux.alibaba.com, baoquan.he@linux.dev, willy@infradead.org,
	youngjun.park@lge.com, hannes@cmpxchg.org, riel@surriel.com,
	shakeel.butt@linux.dev, alex@ghiti.fr, kas@kernel.org, baohua@kernel.org,
	dev.jain@arm.com, baolin.wang@linux.alibaba.com, npache@redhat.com,
	liam@infradead.org, ryan.roberts@arm.com, vbabka@kernel.org,
	lance.yang@linux.dev, linux-kernel@vger.kernel.org, nphamcs@gmail.com,
	shikemeng@huaweicloud.com, kernel-team@meta.com, linux-mm@kvack.org
Subject: Re: [v2 11/16] mm: handle PMD swap entries in non-present PMD walkers
Date: Fri, 12 Jun 2026 14:45:50 +0800
Message-Id: <20260612064550.54968-1-lance.yang@linux.dev>
In-Reply-To: <20260602142537.198755-12-usama.arif@linux.dev>
References: <20260602142537.198755-12-usama.arif@linux.dev>
X-Mailing-List: linux-kernel@vger.kernel.org

+Cc linux-mm

Please Cc linux-mm next time. Pretty clearly MM work ...

On Tue, Jun 02, 2026 at 07:24:19AM -0700, Usama Arif wrote:
[...]
>diff --git a/mm/mincore.c b/mm/mincore.c
>index e5d13eea9234..3fee8a7b9d9d 100644
>--- a/mm/mincore.c
>+++ b/mm/mincore.c
>@@ -172,7 +172,19 @@ static int mincore_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
> 
> 	ptl = pmd_trans_huge_lock(pmd, vma);
> 	if (ptl) {
>-		memset(vec, 1, nr);
>+		if (pmd_present(*pmd)) {
>+			memset(vec, 1, nr);
>+		} else {
>+			/*
>+			 * Non-present PMD: migration, device-private, or PMD
>+			 * swap entry. Route through mincore_swap() the same way
>+			 * the PTE path does -- the swap entry covers all 512
>+			 * slots, so the whole vec gets the same answer.
>+			 */
>+			softleaf_t entry = softleaf_from_pmd(*pmd);
>+
>+			memset(vec, mincore_swap(entry, false), nr);

Looks buggy ... That assumes one swap-cache lookup is enough for the
whole PMD-sized range. I don't think that always holds ...

See do_huge_pmd_swap_page():

---8<---
	folio = swap_cache_get_folio(swp_entry);
[...]
	/*
	 * Folio should be PMD-sized; if not (e.g. split in swap cache),
	 * split the PMD swap entry and retry at PTE level.
	 */
	if (folio_nr_pages(folio) != HPAGE_PMD_NR) {
		folio_unlock(folio);
		folio_put(folio);
		goto split_fallback;
	}
---

It handles the case where swap_cache_get_folio() returns a folio that is
no longer PMD-sized, e.g. because it was split in the swap cache while
the PMD swap entry was installed. It then splits the PMD swap entry and
retries at PTE level :)

unuse_pmd_entry() has the same fallback.

Can mincore hit that case? If so, a single mincore_swap() result copied
across the whole vec could be wrong for some of the pages.

Maybe the comment right above should say something like:

"
One lookup is enough for a PMD-sized swapcache folio. If the swapcache
was split, check the per-page swap slots.
"

Hopefully, I'm not missing something here :D

Cheers,
Lance