From: Lance Yang
Date: Thu, 19 Mar 2026 17:14:06 +0800
Subject: Re: [syzbot] [mm?] kernel BUG in collapse_scan_file
To: "David Hildenbrand (Arm)", "Lorenzo Stoakes (Oracle)"
Cc: syzbot, willy@infradead.org, baolin.wang@linux.alibaba.com,
 npache@redhat.com, linux-mm@kvack.org, baohua@kernel.org,
 ryan.roberts@arm.com, syzkaller-bugs@googlegroups.com, dev.jain@arm.com,
 ziy@nvidia.com, linux-kernel@vger.kernel.org, Liam.Howlett@oracle.com,
 akpm@linux-foundation.org
References: <69bba3c0.050a0220.227207.002b.GAE@google.com>
 <10e5f1d6-077d-4783-aa16-6c8b98cb9e74@lucifer.local>

On 2026/3/19 17:00, David Hildenbrand (Arm) wrote:
> On 3/19/26 09:53, Lorenzo Stoakes (Oracle) wrote:
>> On Thu, Mar 19, 2026 at 04:05:38PM +0800, Lance Yang wrote:
>>> Ccing Willy
>>>
>>> IIUC, this is a dup of the earlier report[1], which I looked into back
>>> in January.
>>> The root cause is the same: collapse_file() calls
>>> xas_lock_irq() without resetting the xas state first, tripping the
>>> XAS_INVALID() assertion:
>>>
>>> #define xas_lock_irq(xas) xa_lock_irq(XAS_INVALID(xas)->xa)
>>>
>>> static inline struct xa_state *XAS_INVALID(struct xa_state *xas)
>>> {
>>> 	XA_NODE_BUG_ON(xas->xa_node, xas_valid(xas));
>>> 	return xas;
>>> }
>>>
>>> Added by commit:
>>>
>>> commit 43b00759f21b10142094d1ae5ff65cbb368953a3
>>> Author: Matthew Wilcox (Oracle)
>>> Date:   Sun Dec 14 10:53:31 2025 -0500
>>>
>>>     XArray: Add extra debugging check to xas_lock and friends
>>>
>>>     While tracking down a recent bug, we discovered somewhere that had
>>>     forgotten to call xas_reset() before calling xas_lock(). Add a debug
>>>     check to be sure that doesn't happen in future and fix all the places in
>>>     the test suite which were carelessly doing just this.
>>>
>>>     Suggested-by: Linus Torvalds
>>>     Signed-off-by: Matthew Wilcox (Oracle)
>>>
>>> I posted a HACK fix at the time[2], but David pointed out that Willy
>>> had mentioned it likely needs more thought[3].
>>
>> Hmm we shouldn't leave this bug in place while working for a fancier fix??
>>
>> Can we get _something_ going as an upstream fix? We can improve whatever we do
>> later right?
>>
>> David, thoughts?
>
> I recall Willy mentioning that the issue is likely a false positive.
>
> IIUC, that commit is not upstream? So it only triggers in linux-next.

Right. That does not appear to be in upstream, I only see it in
linux-next :)

> Which means:
>
> 1) If it's a false positive, upstream is not affected (no XA_NODE_BUG_ON)
>
> 2) If it's not a false positive, upstream is affected but does not
>    trigger the XA_NODE_BUG_ON

Yep. So this particular BUG_ON is not affecting upstream directly.

That said, syzbot will likely keep hitting it in linux-next and
generating noise for us until it is addressed there ...
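
For anyone following along, here is a rough userspace model of the pattern
the new check guards against. It is only a sketch, not the kernel code: every
model_* name is made up for illustration, the "lock" is just a flag, and the
debug check reports a boolean instead of hitting a BUG. The only idea borrowed
from the quoted snippet is that the state counts as "valid" while it still
points at a node, and that locking in that state is what the assertion
complains about.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
struct model_node { int unused; };

struct model_xa_state {
	struct model_node *node;	/* cursor: a node pointer or a sentinel */
	bool locked;
};

/* Sentinel with the low bits set, so it can never be a real node pointer. */
#define MODEL_RESTART ((struct model_node *)(uintptr_t)3)

/* "Valid" here means the cursor is not a sentinel (low two bits clear). */
static bool model_xas_valid(const struct model_xa_state *xas)
{
	return ((uintptr_t)xas->node & 3) == 0;
}

/* Forget any cached position so the next walk starts from the top. */
static void model_xas_reset(struct model_xa_state *xas)
{
	xas->node = MODEL_RESTART;
}

/* Pretend a walk landed on @node, leaving the cursor "valid". */
static void model_xas_walk(struct model_xa_state *xas, struct model_node *node)
{
	xas->node = node;
}

/*
 * Take the (pretend) lock. Returns true if the debug check would have
 * fired, i.e. the check is enabled and the cursor was not reset first.
 * With debug_check == false the stale cursor goes unnoticed.
 */
static bool model_xas_lock(struct model_xa_state *xas, bool debug_check)
{
	bool would_fire = debug_check && model_xas_valid(xas);

	xas->locked = true;
	return would_fire;
}

static void model_xas_unlock(struct model_xa_state *xas)
{
	assert(xas->locked);
	xas->locked = false;
}
```

Calling model_xas_lock() with debug_check set to true stands in for a tree
that carries the extra check, and false for one that does not: in the second
case a cursor left over from model_xas_walk() is locked silently, which is
David's case 2) above. Calling model_xas_reset() before the lock makes the
check pass in both cases.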