From mboxrd@z Thu Jan 1 00:00:00 1970
From: ChunYu Wang
Subject: Re: kernel BUG at fs/ext4/fsync.c:LINE!
Date: Fri, 15 Sep 2017 02:22:01 +0800
Message-ID:
References:
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Cc: GeneBlue, "Theodore Ts'o", linux-ext4, LKML, syzkaller
To: Andreas Dilger
Return-path:
Received: from mail-qt0-f173.google.com ([209.85.216.173]:53771 "EHLO mail-qt0-f173.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751316AbdINSWC (ORCPT ); Thu, 14 Sep 2017 14:22:02 -0400
Received: by mail-qt0-f173.google.com with SMTP id 47so163331qts.10 for ; Thu, 14 Sep 2017 11:22:02 -0700 (PDT)
In-Reply-To:
Sender: linux-ext4-owner@vger.kernel.org
List-ID:

On Fri, Sep 15, 2017 at 12:41 AM, Andreas Dilger wrote:
> I don't think a reproducer is needed. It looks like the fsync callpath
> is happening from an IRQ context due to IO completion, and then re-entering
> the filesystem while a transaction is already started. It looks like the
> original IO was submitted with AIO based on the functions on the IRQ stack,
> which is likely why nobody has hit it (AIO isn't very commonly used).
>
> That said, I don't follow the reasoning behind the convoluted series of AIO
> callbacks that has IO _completion_ calling vfs_fsync_range() and re-entering
> the filesystem to flush out more data?

Thanks for the analysis. I do think the syzkaller reproducer (in fact,
the log) may also answer your question and help pinpoint exactly what
triggers the issue.

Also, I am not experienced enough to analyze such a complex problem
from the call trace and code alone :)

- ChunYu