Date: Mon, 5 Jun 2023 17:16:40 +0100
From: Matthew Wilcox
To: Dave Chinner
Cc: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: DIO hangs in 6.4.0-rc2
X-Mailing-List: linux-fsdevel@vger.kernel.org

On Wed, May 17, 2023 at 08:09:35AM +1000, Dave Chinner wrote:
> On Tue, May 16, 2023 at 01:28:00PM +0100, Matthew Wilcox wrote:
> > Plain 6.4.0-rc2 with a relatively minor change to the futex code that
> > I cannot believe was in any way responsible for this.
> >
> > kworkers blocked all over the place. Some on XFS_ILOCK_EXCL. Some on
> > xfs_buf_lock. One in xfs_btree_split() calling wait_for_completion.
> >
> > This was an overnight test run that is now dead, so I can't get any
> > more info from the locked-up kernel. I have the vmlinux if some
> > decoding of offsets is useful.
>
> This is likely the same AGF try-lock bug that was discovered in this
> thread:
>
> https://lore.kernel.org/linux-xfs/202305090905.aff4e0e6-oliver.sang@intel.com/
>
> The fact that the try-lock was ignored means that out-of-order AGF
> locking can be attempted, and it is the try-lock that prevents
> deadlocks from occurring.
>
> Can you try the patch below - I was going to send it for review
> anyway this morning, so it can't hurt to see if it also fixes this
> issue.

I still have this patch in my tree and it's not in rc5. Was this
problem fixed some other way, or does it still need to land upstream?
I don't see any changes to XFS since May 11th's pull request.