From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mx2.suse.de ([195.135.220.15]:60317 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750738AbdBKGfe (ORCPT ); Sat, 11 Feb 2017 01:35:34 -0500
Date: Sat, 11 Feb 2017 07:33:08 +0100
From: Michal Hocko
To: Eryu Guan
Cc: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Christoph Hellwig
Subject: Re: [BUG 4.10-rc7] sb_fdblocks inconsistency in xfs/297 test
Message-ID: <20170211063308.GA30713@dhcp22.suse.cz>
References: <20170210035348.GA7075@eguan.usersys.redhat.com>
	<20170210071418.GC9346@dhcp22.suse.cz>
	<20170210080210.GC10893@dhcp22.suse.cz>
	<20170210093131.GH10893@dhcp22.suse.cz>
	<20170211060204.GA24562@eguan.usersys.redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170211060204.GA24562@eguan.usersys.redhat.com>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Sat 11-02-17 14:02:04, Eryu Guan wrote:
> On Fri, Feb 10, 2017 at 10:31:31AM +0100, Michal Hocko wrote:
> > [CC Christoph]
> >
> > On Fri 10-02-17 09:02:10, Michal Hocko wrote:
> > > On Fri 10-02-17 08:14:18, Michal Hocko wrote:
> > > > On Fri 10-02-17 11:53:48, Eryu Guan wrote:
> > > > > Hi,
> > > > >
> > > > > I was testing 4.10-rc7 kernel and noticed that xfs_repair reported XFS
> > > > > corruption after fstests xfs/297 test. This didn't happen with 4.10-rc6
> > > > > kernel, and git bisect pointed the first bad commit to
> > > > >
> > > > > commit d1908f52557b3230fbd63c0429f3b4b748bf2b6d
> > > > > Author: Michal Hocko
> > > > > Date: Fri Feb 3 13:13:26 2017 -0800
> > > > >
> > > > > fs: break out of iomap_file_buffered_write on fatal signals
> > > > >
> > > > > Tetsuo has noticed that an OOM stress test which performs large write
> > > > > requests can cause the full memory reserves depletion. He has tracked
> > > > > this down to the following path
> > > > > ....
> > > > >
> > > > > It's the sb_fdblocks field reports inconsistency:
> > > > > ...
> > > > > Phase 2 - using internal log
> > > > > - zero log...
> > > > > - scan filesystem freespace and inode maps...
> > > > > sb_fdblocks 3367765, counted 3367863
> > > > > - 11:37:41: scanning filesystem freespace - 16 of 16 allocation groups done
> > > > > - found root inode chunk
> > > > > ...
> > > > >
> > > > > And it can be reproduced almost 100% with all XFS test configurations
> > > > > (e.g. xfs_4k xfs_2k_reflink), on all test hosts I tried (so I didn't
> > > > > bother pasting my detailed test and host configs, if more info is needed
> > > > > please let me know).
> > > >
> > > > The patch can lead to short writes when the task is killed. Was there
> > > > any OOM killer triggered during the test? If not who is killing the
> > > > task? I will try to reproduce later today.
> > >
> > > I have checked both tests and they are killing the test but none of them
> > > seems to be using SIGKILL. The patch should make a difference only for
> > > fatal signal (aka SIGKILL). Is there any other part that can do SIGKILL
> > > except for the OOM killer?
>
> No, I'm not aware of any other part in fstests harness could send
> SIGKILL.

Hmm, maybe this is a result of the group_exit, which sends SIGKILL to
the other threads (zap_other_threads).

[...]
> > So somebody had to send SIGKILL to fsstress. Anyway, I am wondering
> > whether this is really a regression. xfs_file_buffered_aio_write used to
> > call generic_perform_write which does the same thing.
>
> Maybe it just uncovered some existing bug?

Maybe.

> Anyway, a reliable reproduced filesystem metadata inconsistency does
> smell like a bug.

Definitely!

Unfortunately I am going to disappear for a week. I will be back on the
20th. Anyway, I believe iomap_file_buffered_write and its callers _should_
be able to handle short writes. EINTR is not the only way this can
happen; ENOMEM would be another.
--
Michal Hocko
SUSE Labs
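
For illustration only, here is what "handling a short write" looks like
from the userspace side of write(2). This is a minimal sketch, not the
kernel code path discussed above: write_all() is a hypothetical helper
(it is not part of fstests, xfsprogs, or the kernel), and treating a zero
return as EIO is an assumption of this sketch. The kernel-side callers of
iomap_file_buffered_write would need the equivalent bookkeeping for a
partially completed request.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * write_all() - hypothetical helper, for illustration only.
 *
 * Keeps calling write(2) until all len bytes have been written or a
 * real error occurs. Returns 0 on success, -1 on failure with errno
 * describing the error.
 */
int write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			/* Interrupted before any data was written: retry. */
			if (errno == EINTR)
				continue;
			/* Any other error (ENOMEM, ENOSPC, EIO, ...) is fatal here. */
			return -1;
		}
		if (n == 0) {
			/* Assumption of this sketch: no progress is reported as EIO. */
			errno = EIO;
			return -1;
		}
		/* Short write: account for what was written and continue. */
		p += n;
		len -= (size_t)n;
	}
	return 0;
}
```

The point of the loop is that a return value smaller than the requested
length is not an error; only the bytes actually written may be accounted
for, and the remainder has to be either retried or reported.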