Message-ID: <93275e30-ef8b-4ca0-9854-206b4232d90c@gmail.com>
Date: Wed, 13 May 2026 11:24:14 +0800
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] mm/khugepaged: avoid underflow in madvise_collapse for sub-PMD MADV_COLLAPSE
To: Lorenzo Stoakes, "David Hildenbrand (Arm)"
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 akpm@linux-foundation.org, ziy@nvidia.com, baolin.wang@linux.alibaba.com,
 liam@infradead.org, npache@redhat.com, ryan.roberts@arm.com,
 dev.jain@arm.com, baohua@kernel.org, lance.yang@linux.dev
References: <20260511065701.799006-1-chenwandun@lixiang.com>
 <4e3d0c1b-af33-470d-acb0-1e6540ba312a@gmail.com>
 <0a860f74-b2fc-4ea7-9f13-1879c2c8c168@kernel.org>
From: Wandun

On 5/12/26 00:21, Lorenzo Stoakes wrote:
> On Mon, May 11, 2026 at 10:01:39AM +0200, David Hildenbrand (Arm) wrote:
>> On 5/11/26 09:35, Wandun wrote:
>>>
>>> On 5/11/26 15:17, David Hildenbrand (Arm) wrote:
>>>> On 5/11/26 08:57, Wandun Chen wrote:
>>>>> From: Chen Wandun
>>>>>
>>>>> madvise_collapse() computes the THP-aligned window:
>>>>>
>>>>>      hstart = ALIGN(start, HPAGE_PMD_SIZE);     /* round up */
>>>>>      hend = ALIGN_DOWN(end, HPAGE_PMD_SIZE);    /* round down */
>>>>>
>>>>> The following case will cause
>>>>> hstart > hend, and result in underflow
>>>>> in the return statement, avoid it by returning -EINVAL early when
>>>>> hstart > hend.
>>>>>
>>>>>      madvise(PMD-aligned + PAGE_SIZE, PAGE_SIZE, MADV_COLLAPSE);
>>>> Ok, so providing a PMD-aligned address as start will result in 0 and a
>>>> non-aligned address will result in -EINVAL.
>>>>
>>>> Didn't Lorenzo agree that just returning 0 in both cases would be clearer? But I
>>>> might have misunderstood it.
>>> Lorenzo suggested returning -EINVAL for both cases at the beginning.
>>> Later, Lorenzo added a correction, suggesting we should return 0 for
>>> compatibility reasons in the hstart == hend case.
>>> (If I haven't missed any information)
>> Let's wait for Lorenzo's confirmation.
> :) thanks.
>
> See below but TL;DR I convinced myself that actually, I agree with David...
>
> I hadn't really examined the madvise() <-> madvise_collapse() logic closely
> enough but yeah. Return 0 for both cases.
>
>> I think the important part is that we cannot have a situation where start > end
>> (given that madvise() consumes a length). Because, there we really should have
>> returned -EINVAL.
>>
>> For start <= end, if there is nothing suitable to collapse, I'd say we'd just
>> consistently return 0.
> Right so madvise_vma_behavior() should be called with range->[start, end] tied
> to the VMA (under VMA lock we assert this also).
>
> So what we're really talking about is hstart, hend.
>
> Really we should NEVER have aligned the addresses for the user, that was the
> real mistake here. But that ship has sailed...
>
> Since we do:
>
>     hstart = ALIGN(start, HPAGE_PMD_SIZE);
>     hend = ALIGN_DOWN(end, HPAGE_PMD_SIZE);
>
> That means e.g.
>
>     start = + 1
>     end = start +
>
> Results in hstart > hend, so the user must have given incorrect input.
>
> hstart == hend would be e.g.:
>
>     start = + 1
>     end = start + HPAGE_PMD_SIZE
>
> Which is still an invalid input.
> I honestly wish we just required that the user
> provided PMD aligned ranges and we didn't align for them, it's stupid that we
> do.
>
> So the real question is, what constitutes an actual invalid input here? We've
> made a mess and now we have to decide how we interpret it... :)
>
> I still feel that hstart > hend is an error. Since we are aligning things the
> stupid way we do, we are treating start, end as _bounds_. So we are saying 'turn
> everything in the bounded range [start, end) into huge pages'.
>
> So we use ALIGN() on start so nothing BEFORE start gets converted. And we use
> ALIGN_DOWN() on end so nothing AFTER end gets converted.
>
> But if you do:
>
> (first case above)
>
> PMD aligned                PMD aligned
> | <----------------->      |
>
> You're kinda doing something stupid obviously, because that range cannot span
> any huge pages, you don't even cross a boundary, and importantly - your range
> _isn't even large enough to include a single page_.
>
> With:
>
> PMD aligned                PMD aligned
> | <------------------------|->
>
> You are crossing a boundary but not enough to get a page. But you might well
> have a large enough range to span a single page...
>
> But OTOH...
>
> PMD aligned                PMD aligned
> |                      <---|->
>
> Would get you the same and is equally silly.
>
> Yeah ok this is a long way round of coming to the same conclusion as David,
> godamnit, and I so wanted to disagree here :P
>
> Since both get you nothing, and the input was valid _to madvise()_ let's just
> return 0 for both cases.

Thanks for taking the time to walk through this, Lorenzo and David.

Now that we've agreed both cases should consistently return 0, I'll send a v3
that simply bails out with 0 when hstart >= hend.

Best regards,
Wandun

>
>> --
>> Cheers,
>>
>> David
> Cheers, Lorenzo
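
For reference, here is a minimal userspace sketch of the alignment arithmetic discussed in this thread. It is not the kernel code and not the v3 patch. PAGE_SIZE and HPAGE_PMD_SIZE are assumed to be 4 KiB and 2 MiB, the ALIGN()/ALIGN_DOWN() macros are local stand-ins for the kernel ones, and naive_span() and pmd_windows() are hypothetical helper names used only for illustration.

```c
#include <assert.h>

/* Assumed values for illustration: 4 KiB base pages, 2 MiB PMD-sized THPs. */
#define PAGE_SIZE       4096UL
#define HPAGE_PMD_SIZE  (2UL * 1024 * 1024)

/* Userspace stand-ins for the kernel macros; 'a' must be a power of two. */
#define ALIGN(x, a)       (((x) + (a) - 1) & ~((a) - 1))
#define ALIGN_DOWN(x, a)  ((x) & ~((a) - 1))

/*
 * Hypothetical helper showing the problem: with unsigned arithmetic,
 * hend - hstart wraps around to a huge value when hstart > hend.
 */
static unsigned long naive_span(unsigned long start, unsigned long end)
{
	unsigned long hstart = ALIGN(start, HPAGE_PMD_SIZE);    /* round up */
	unsigned long hend = ALIGN_DOWN(end, HPAGE_PMD_SIZE);   /* round down */

	return hend - hstart;
}

/*
 * Hypothetical helper showing the agreed behaviour: if no PMD-sized
 * window fits inside [start, end), report 0 instead of subtracting.
 */
static unsigned long pmd_windows(unsigned long start, unsigned long end)
{
	unsigned long hstart, hend;

	/* madvise() takes a length, so callers never pass end < start. */
	assert(start <= end);

	hstart = ALIGN(start, HPAGE_PMD_SIZE);
	hend = ALIGN_DOWN(end, HPAGE_PMD_SIZE);

	if (hstart >= hend)
		return 0;

	return (hend - hstart) / HPAGE_PMD_SIZE;
}
```

With a PMD-aligned base address, pmd_windows(base + PAGE_SIZE, base + 2 * PAGE_SIZE) hits the hstart > hend case and pmd_windows(base + 1, base + 1 + HPAGE_PMD_SIZE) hits the hstart == hend case. Both yield 0, whereas naive_span() on the first range wraps around.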