linux-xfs.vger.kernel.org archive mirror
From: Michal Hocko <mhocko@kernel.org>
To: Eryu Guan <eguan@redhat.com>
Cc: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [BUG 4.10-rc7] sb_fdblocks inconsistency in xfs/297 test
Date: Sat, 11 Feb 2017 07:33:08 +0100
Message-ID: <20170211063308.GA30713@dhcp22.suse.cz>
In-Reply-To: <20170211060204.GA24562@eguan.usersys.redhat.com>

On Sat 11-02-17 14:02:04, Eryu Guan wrote:
> On Fri, Feb 10, 2017 at 10:31:31AM +0100, Michal Hocko wrote:
> > [CC Christoph]
> > 
> > On Fri 10-02-17 09:02:10, Michal Hocko wrote:
> > > On Fri 10-02-17 08:14:18, Michal Hocko wrote:
> > > > On Fri 10-02-17 11:53:48, Eryu Guan wrote:
> > > > > Hi,
> > > > > 
> > > > > I was testing 4.10-rc7 kernel and noticed that xfs_repair reported XFS
> > > > > corruption after fstests xfs/297 test. This didn't happen with 4.10-rc6
> > > > > kernel, and git bisect pointed the first bad commit to
> > > > > 
> > > > > commit d1908f52557b3230fbd63c0429f3b4b748bf2b6d
> > > > > Author: Michal Hocko <mhocko@suse.com>
> > > > > Date:   Fri Feb 3 13:13:26 2017 -0800
> > > > > 
> > > > >     fs: break out of iomap_file_buffered_write on fatal signals
> > > > > 
> > > > >     Tetsuo has noticed that an OOM stress test which performs large write
> > > > >     requests can cause the full memory reserves depletion.  He has tracked
> > > > >     this down to the following path
> > > > > ....
> > > > > 
> > > > > It's the sb_fdblocks field that reports an inconsistency:
> > > > > ...
> > > > > Phase 2 - using internal log   
> > > > >         - zero log...
> > > > >         - scan filesystem freespace and inode maps...
> > > > > sb_fdblocks 3367765, counted 3367863
> > > > >         - 11:37:41: scanning filesystem freespace - 16 of 16 allocation groups done
> > > > >         - found root inode chunk
> > > > > ...
> > > > > 
> > > > > And it can be reproduced almost 100% with all XFS test configurations
> > > > > (e.g. xfs_4k xfs_2k_reflink), on all test hosts I tried (so I didn't
> > > > > bother pasting my detailed test and host configs, if more info is needed
> > > > > please let me know).
> > > > 
> > > > The patch can lead to short writes when the task is killed. Was there
> > > > any OOM killer triggered during the test? If not who is killing the
> > > > task? I will try to reproduce later today.
> > > 
> > > I have checked both tests and they do kill the test processes, but
> > > neither of them seems to use SIGKILL. The patch should make a
> > > difference only for a fatal signal (i.e. SIGKILL). Is there anything
> > > else that can send SIGKILL besides the OOM killer?
> 
> No, I'm not aware of any other part of the fstests harness that could
> send SIGKILL.

Hmm, maybe this is a result of group_exit, which sends SIGKILL to the
other threads (zap_other_threads).
 
[...]
> > So somebody had to send SIGKILL to fsstress. Anyway, I am wondering
> > whether this is really a regression. xfs_file_buffered_aio_write used to
> > call generic_perform_write which does the same thing.
> 
> Maybe it just uncovered some existing bug?

maybe

> Anyway, a reliably reproducible filesystem metadata inconsistency does
> smell like a bug.

Definitely! Unfortunately I am going to disappear for a week and will be
back on the 20th. Anyway, I believe iomap_file_buffered_write and its
callers _should_ be able to handle short writes. EINTR is not the only
way this can happen; ENOMEM would be another.
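
As a userspace analogy only (this is not the kernel path discussed above
and is not taken from the patch), a caller that tolerates short writes
records how many bytes were accepted and resumes from that offset. The
helper name write_all is hypothetical:

```python
import os


def write_all(fd: int, data: bytes) -> int:
    """Write all of data to fd, resuming after any short write."""
    view = memoryview(data)
    total = 0
    while total < len(view):
        # os.write() may accept fewer bytes than requested (a short write),
        # so continue from the first byte that was not written.
        written = os.write(fd, view[total:])
        if written == 0:
            # Avoid spinning forever if no progress can be made.
            raise OSError("write made no progress")
        total += written
    return total
```

The point of the sketch is just that a short count is treated as partial
progress rather than as a failure of the whole request.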

-- 
Michal Hocko
SUSE Labs

Thread overview: 11+ messages
2017-02-10  3:53 [BUG 4.10-rc7] sb_fdblocks inconsistency in xfs/297 test Eryu Guan
2017-02-10  4:20 ` Eryu Guan
2017-02-10  7:14 ` Michal Hocko
2017-02-10  8:02   ` Michal Hocko
2017-02-10  9:31     ` Michal Hocko
2017-02-11  6:02       ` Eryu Guan
2017-02-11  6:33         ` Michal Hocko [this message]
2017-02-20 14:25       ` Michal Hocko
2017-02-20 14:58         ` Brian Foster
2017-02-21  4:14           ` Eryu Guan
2017-02-21  8:13             ` Michal Hocko