From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <526fdbc0-1944-4328-9ff6-7922d021828d@linux.dev>
Date: Sat, 13 Jun 2026 20:18:34 +0100
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Subject: Re: [v2 00/16] mm: PMD-level swap entries for anonymous THPs
To: Lance Yang
Cc: david@kernel.org, ying.huang@linux.alibaba.com, baoquan.he@linux.dev,
 willy@infradead.org, youngjun.park@lge.com, hannes@cmpxchg.org,
 riel@surriel.com, ljs@kernel.org, shakeel.butt@linux.dev, alex@ghiti.fr,
 kas@kernel.org, baohua@kernel.org, dev.jain@arm.com,
 baolin.wang@linux.alibaba.com, npache@redhat.com, linux-mm@kvack.org,
 akpm@linux-foundation.org, liam@infradead.org, ryan.roberts@arm.com,
 chrisl@kernel.org, vbabka@kernel.org, linux-kernel@vger.kernel.org,
 nphamcs@gmail.com, shikemeng@huaweicloud.com, kernel-team@meta.com,
 kasong@tencent.com, ziy@nvidia.com
References: <680441bf-c878-4a00-8787-63ad8b201bc9@linux.dev>
 <20260613042232.93691-1-lance.yang@linux.dev>
Content-Language: en-US
From: Usama Arif
In-Reply-To: <20260613042232.93691-1-lance.yang@linux.dev>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

On 13/06/2026 05:22, Lance Yang wrote:
>
> On Wed, Jun 10, 2026 at 03:44:32PM +0100, Usama Arif wrote:
>>
>>
>> On 10/06/2026 14:48, David Hildenbrand (Arm) wrote:
>>> On 6/10/26 15:01, Lance Yang wrote:
>>>>
>>>>
>>>> On 2026/6/10 20:24, David Hildenbrand (Arm) wrote:
>>>>> On 6/9/26 16:29, Usama Arif wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> Hello!
>>>>>>
>>>>>> Just following up if there were any reviews/comments on this series!
>>>>>>
>>>>>> I know it's a large series but was just checking if there was any
>>>>>> feedback?
>>>>>
>>>>> It shall be reviewed. We just finished the mTHP khugepaged review to get it into
>>>>> 7.2, so we've all been rather busy.
>>>>
>>>> Right, mTHP khugepaged was a rough one. Glad we got it over the line,
>>>> but yeah, there's just been a lot of THP work lately. pretty nonstop ...
>>>>
>>
>> Yeah it's definitely a lot. I have set a target of leaving review comments on
>> at least 2 patches from mm per day myself, but even that can sometimes be
>> difficult! I will try and help out more in reviews.
>
> Awesome!
>
>>>>> (I mean, just take a look at the THP-related flood of patches we are fighting
>>>>> with on a daily basis, it's not funny anymore)
>>>>>
>>>>> This is clearly going to be 7.3 material, so there is plenty of time given that
>>>>> the merge window is about to open soon.
>>>>
>>>> Usama, I'll try to make this one a priority too. Looks interesting :P
>>
>> Thanks Lance!
>>
>>>
>>> I have two other bigger series to review, but I should soon get to this as well.
>>>
>>
>> No worries at all! Thanks for the reviews! and yeah definitely 7.3.
>>
>> I will send this out again when 7.3-rc1 opens (rebased), so that the reviews won't be on
>> outdated code which could cause some confusion.
>
> After skimming through the whole series, probably PMD swap entries need
> one bigger rethink ...
>
> Emm ... same tricky bit keeps showing up ...
>
> One PMD swap entry is easy to handle while the swapcache still has one
> PMD-sized folio behind it. Once that folio got split and reclaimed, the
> 512 swap slots need per-page handling :)
>
> Maybe worth first pinning down the rule here.
>
> Is a PMD swap entry supposed to mean "there is, or soon will be, one
> PMD-sized folio behind it", or is it just a compact page-table encoding for
> 512 swap slots?
>
> Without that rule being very clear, every caller has to guess how much
> it can assume, and it is easy to miss one ...
>
> So I stopped staring at the details for now, because the same issue keeps
> popping up wearing a slightly different hat :)
>
> Anyway, no clever answer from me here, not a swap expert :( Just pointing
> out the pattern I keep running into.
>

Thanks for the amazing reviews!

For the next revision I’m going to treat a PMD swap entry as just a compact
page-table encoding for 512 ordinary swap slots. It does not mean that the
swapcache still has, or will soon have, one PMD-sized folio behind it.

With that rule, whole-PMD handling is only valid when either:

1. the swapcache still has one PMD-sized folio for the range, or
2. the whole PMD swap range has no cached folios, so the caller can try a
   PMD-sized swapin and still fall back if that is not possible.

If any slot in the range has per-page cache state, the PMD entry has to be
split and the existing PTE paths need to handle the individual slots.

I am reworking the next revision around that. I added a shared helper to
classify the swapcache behind a PMD swap entry as empty, PMD-sized, or split,
then used it in the places where this assumption mattered: mincore,
UFFDIO_MOVE, swapoff, MADV_WILLNEED, and the PMD swap fault path.

UFFDIO_MOVE now checks the whole 512-slot range before moving a PMD swap entry
without a cached folio, and falls back to PTE handling if per-page cached
folios exist.

Thanks!
Usama
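
To make the rule above concrete, here is a minimal userspace sketch of the
three-way classification (empty, PMD-sized, split). It is not the helper from
the series and does not use any kernel API: struct slot_cache, the enum
values, and classify_pmd_swap_cache() are hypothetical names invented for
illustration, and the swapcache is modelled as a plain array of 512 per-slot
records.

```c
#include <stddef.h>

/* 512 swap slots per PMD swap entry, as discussed in the thread. */
#define SLOTS_PER_PMD 512

/*
 * Hypothetical model of the swapcache state behind one swap slot.
 * folio_id == 0 means "no cached folio"; any other value is an opaque
 * identity for the folio that currently caches this slot.
 */
struct slot_cache {
	int folio_id;
	int folio_nr_pages;	/* size of that folio in pages; unused if folio_id == 0 */
};

/* Hypothetical result type: the three cases named in the mail. */
enum pmd_swap_cache_state {
	PMD_SWAP_CACHE_EMPTY,		/* no slot in the range has a cached folio */
	PMD_SWAP_CACHE_PMD_FOLIO,	/* one PMD-sized folio covers the whole range */
	PMD_SWAP_CACHE_SPLIT,		/* some per-page (or partial) cache state exists */
};

static enum pmd_swap_cache_state
classify_pmd_swap_cache(const struct slot_cache slots[SLOTS_PER_PMD])
{
	size_t cached = 0;

	for (size_t i = 0; i < SLOTS_PER_PMD; i++) {
		if (slots[i].folio_id == 0)
			continue;
		cached++;
		/*
		 * A cached slot only fits the "PMD-sized" case if its folio
		 * spans 512 pages and is the same folio that backs slot 0.
		 */
		if (slots[i].folio_nr_pages != SLOTS_PER_PMD ||
		    slots[i].folio_id != slots[0].folio_id)
			return PMD_SWAP_CACHE_SPLIT;
	}

	if (cached == 0)
		return PMD_SWAP_CACHE_EMPTY;
	if (cached == SLOTS_PER_PMD)
		return PMD_SWAP_CACHE_PMD_FOLIO;
	return PMD_SWAP_CACHE_SPLIT;
}
```

In this model a caller would keep whole-PMD handling only for the first two
results and split the PMD entry to go through the PTE paths on
PMD_SWAP_CACHE_SPLIT. Locking and races with concurrent swapcache changes are
deliberately left out of the sketch.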