From: "Ilya Gladyshev"
Date: Tue, 23 Jun 2026 21:23:32 +0000
Subject: Re: [PATCH v4 0/2] mm: improve folio refcount scalability
To: "Matthew Wilcox"
Cc: "Linus Torvalds", "Andrew Morton", ivgorbunov@me.com,
 Liam.Howlett@oracle.com, apopple@nvidia.com, artem.kuzin@huawei.com,
 baolin.wang@linux.alibaba.com, david@kernel.org, foxido@foxido.dev,
 harry.yoo@oracle.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 lorenzo.stoakes@oracle.com, mhocko@suse.com, muchun.song@linux.dev,
 rppt@kernel.org, surenb@google.com, vbabka@suse.cz, yuzhao@google.com,
 ziy@nvidia.com, pfalcato@suse.de, kirill@shutemov.name
References: <20260608154734.8e4115fde4e2e14a3b6892fb@linux-foundation.org>
 <839a2ea2755fdddf5af773e006237b07c9e261df@linux.dev>
X-Mailing-List: linux-kernel@vger.kernel.org

> On Sun, Jun 21, 2026 at 09:34:47PM +0000, Gladyshev Ilya wrote:
>
> > June 21, 2026 at 7:46 AM, Linus Torvalds wrote:
> >
> > On Sat, 20 Jun 2026 at 11:19, wrote:
> >
> > > T2: optimistic get() [0 -> 1]
> > > T2: put page back [1 -> 0]
> > > T2: calls dtor for type X, returns into the allocator
> > >
> > Which optimistic getter does this?
> >
> > If I understood you correctly, you are talking about the scenario where
> > an optimistic getter took a refcount on the stolen page, so the validity
> > check in the XArray will fail.
> > And this scenario does indeed work normally.
> >
> > This "ABA" happens if the optimistic getter successfully gets a refcount
> > on a valid page, so the full T2 execution looks like this:
> >
> > T2: optimistic get() [0 -> 1]
> > T2: re-checks page [OK]
>
> I don't think that can happen. Or maybe it can and we need to add
> some barriers. The page is always removed from visibility (whether
> we're talking about a page cache lookup or a page table lookup), then
> the refcount is decremented. I hope we have enough barriers in place
> to ensure that the refcount decrement is observed after the removal of
> the PTE entry or the XArray entry.
>
> But I'm not sure why the folio_put() after a speculative get avoids this
> problem; why do we need the recheck to be successful to hit this race?

After additional thought, this race doesn't require speculative gets at
all and is more about two parallel `folio_put()` calls: one successfully
deallocates the page, and the other sleeps for a long time and then calls
__folio_put() on a "logically new" page.

And as everybody pointed out, this race isn't specific to this patch.
folio_put() always had to be ready to be called on non-folio objects
because of the optimistic try_get() + put() in the filecache. We only
care about calling folio_put() once, and nothing breaks with this patch.

So thanks to you, Linus, and David for helping to clarify this :) I'll
post v5 with more or less cosmetic fixes from the AI review then.
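
For illustration, the get-then-recheck lookup and the "remove from
visibility, then drop the refcount" ordering discussed above can be
sketched in userspace C11. This is a minimal sketch under stated
assumptions, not kernel code: `struct obj`, `struct slot`, and every
function name here are hypothetical stand-ins for a folio, an XArray slot
or PTE, and the try_get()/put() helpers. It uses the default sequentially
consistent atomics, so it sidesteps the barrier question raised above
rather than answering it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio: a refcount plus a destructor counter. */
struct obj {
    atomic_int refcount;
    int dtor_calls;
};

/* Hypothetical stand-in for an XArray slot or PTE publishing the object. */
struct slot {
    _Atomic(struct obj *) ptr;
};

/* Take a reference unless the count is already zero (object being freed). */
static bool obj_try_get(struct obj *o)
{
    int old = atomic_load(&o->refcount);

    while (old != 0) {
        /* On failure, 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&o->refcount, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference; whoever moves the count to zero runs the destructor. */
static void obj_put(struct obj *o)
{
    int old = atomic_fetch_sub(&o->refcount, 1);

    assert(old > 0);
    if (old == 1)
        o->dtor_calls++; /* stands in for returning the page to the allocator */
}

/* Speculative lookup: get first, then re-check that the slot still points here. */
static struct obj *slot_lookup(struct slot *s)
{
    struct obj *o = atomic_load(&s->ptr);

    if (o == NULL || !obj_try_get(o))
        return NULL;
    if (atomic_load(&s->ptr) != o) {
        obj_put(o); /* lost the race with removal: drop the speculative ref */
        return NULL;
    }
    return o;
}

/* Removal: hide the object from lookups first, then drop the slot's reference. */
static void slot_remove(struct slot *s)
{
    struct obj *o = atomic_exchange(&s->ptr, NULL);

    if (o != NULL)
        obj_put(o);
}
```

Run single-threaded, a lookup takes the count from 1 to 2, removal drops
it back to 1, and the final put runs the destructor exactly once. The
sketch does not model memory being reused for a "logically new" object
between a delayed put and the free.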