Date: Fri, 6 Mar 2026 14:13:26 +0000
From: Kiryl Shutsemau
To: Matthew Wilcox
Cc: Chris J Arges, akpm@linux-foundation.org, william.kucharski@oracle.com,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, kernel-team@cloudflare.com
Subject: Re: [PATCH RFC 1/1] mm/filemap: handle large folio split race in page cache lookups
References: <20260305183438.1062312-1-carges@cloudflare.com>
	<20260305183438.1062312-2-carges@cloudflare.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

On Thu, Mar 05, 2026 at 07:24:38PM +0000, Matthew Wilcox wrote:
> On Thu, Mar 05, 2026 at 12:34:33PM -0600, Chris J Arges wrote:
> > We have been hitting VM_BUG_ON_FOLIO(!folio_contains(folio, index)) in
> > production environments.
> > These machines are using XFS with large folio
> > support enabled and are under high memory pressure.
> >
> > From reading the code it seems plausible that folio splits due to memory
> > reclaim are racing with filemap_fault() serving mmap page faults.
> >
> > The existing code checks for truncation (folio->mapping != mapping) and
> > retries, but there does not appear to be equivalent handling for the
> > split case. The result is:
> >
> > kernel BUG at mm/filemap.c:3519!
> > VM_BUG_ON_FOLIO(!folio_contains(folio, index), folio)
>
> This didn't occur to me as a possibility because filemap_get_entry()
> is _supposed_ to take care of it. But if this patch fixes it, then
> we need to understand why it works.
>
> folio_split() needs to be sure that it's the only one holding a reference
> to the folio. To that end, it calculates the expected refcount of the
> folio, and freezes it (sets the refcount to 0 if the refcount is the
> expected value). Once filemap_get_entry() has incremented the refcount,
> freezing will fail.
>
> But of course, we can race. filemap_get_entry() can load a folio first,
> the entire folio_split can happen, then it calls folio_try_get() and
> succeeds, but it no longer covers the index we were looking for. That's
> what the xas_reload() is trying to prevent -- if the index is for a
> folio which has changed, then the xas_reload() should come back with a
> different folio and we goto repeat.
>
> So how did we get through this with a reference to the wrong folio?

What would xas_reload() return if we raced with the split while the index
pointed to a tail page before the split? Wouldn't it return the folio
that was the head, so the check would pass?

-- 
Kiryl Shutsemau / Kirill A. Shutemov