All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Jan Kara <jack@suse.cz>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>,
	"Christoph Hellwig" <hch@infradead.org>,
	"Михаил Гаврилов" <mikhail.v.gavrilov@gmail.com>,
	linux-xfs@vger.kernel.org, linux-mm@kvack.org
Subject: Re: kernel BUG at fs/xfs/xfs_aops.c:853! in kernel 4.13 rc6
Date: Wed, 6 Sep 2017 09:42:56 +1000	[thread overview]
Message-ID: <20170905234256.GP17782@dastard> (raw)
In-Reply-To: <20170905161734.GA25379@quack2.suse.cz>

On Tue, Sep 05, 2017 at 06:17:34PM +0200, Jan Kara wrote:
> On Tue 05-09-17 08:36:48, Dave Chinner wrote:
> > On Mon, Sep 04, 2017 at 02:14:52PM +0200, Jan Kara wrote:
> > > > Seems like a reasonable revert/change, but given that ext3 was killed
> > > > off long ago, is it even still the case that the mm can feed releasepage
> > > > a dirty clean page?  If that is the case, then isn't it time to fix the
> > > > mm too?
> > > 
> > > Yes, ->releasepage() can still get PageDirty page. Whether the page can or
> > > cannot be reclaimed is still upto filesystem to decide.
> > 
> > Yes, and so we have to handle it.  For all I know right now we could
> > be chasing single bit memory error/corruptions....
> 
> Possibly, although I'm not convinced - as I've mentioned I've seen exact
> same assertion failure in XFS on our SLE12-SP2 kernel (4.4 based) in one of
> customers setup. And I've seen two or three times ext4 barfing for exactly
> same reason - buffers stripped from dirty page.

Yeah, we're chasing ghosts at the moment. :/

[....]
> > > Now XFS shouldn't
> > > really end up freeing such page - either because those delalloc / unwritten
> > > checks trigger or because try_to_free_buffers() refuses to free dirty
> > > buffers.
> > 
> > Except if the dirty page has come through the block_invalidation()
> > path, because all the buffers on the page have been invalidated and
> > cleaned. i.e. we've already removed BH_Dirty, BH_Delay and
> > BH_unwritten from all the buffer heads, so invalidated dirty pages
> > will run right through buffers will be removed.
> > 
> > Every caller to ->releasepage() - except the invalidatepage path and
> > the than the bufferhead stripper - checks PageDirty *after* the
> > ->releasepage call and return without doing anything because they
> > aren't supposed to be releasing dirty pages. So if XFS has decided
> > the page can be released, but a mapping invalidation call then notes
> > the page is dirty, it won't invalidate the pagei but it will have
> > had the bufferheads stripped. That's another possible vector, and
> > one that explicit checking of the page dirty flag will avoid.
> 
> Are you speaking about the PageDirty check in __remove_mapping()? I agree
> that checking PageDirty in releasepage would narrow that window for
> corruption although won't close it completely - there are places in the
> kernel that call set_page_dirty() without page lock held and can thus race
> with page invalidation. But I didn't find how any such callsite could race
> to cause what we are observing...

I was referring to invalidate_complete_page2() - I didn't look down
the __remove_mapping() path after I found the first example in
icp2....

> > Hence my question about XFS being able to cancel the page dirty flag
> > before calling block_invalidation() so that we can untangle the mess
> > where we can't tell the difference between a "must release a dirty
> > invalidated page because we've already invalidated the bufferheads"
> > context and the other "release page only if not dirty" caller
> > context?
> 
> Yeah, I agree that if you add cancel_dirty_page() into
> xfs_vm_invalidatepage() before calling block_invalidatepage() and then bail
> on dirty page in xfs_vm_releasepage(), things should work as well and they
> would be more robust.

Ok, I'll put together a patch to do that. Thanks Jan!

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

WARNING: multiple messages have this Message-ID (diff)
From: Dave Chinner <david@fromorbit.com>
To: Jan Kara <jack@suse.cz>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>,
	"Christoph Hellwig" <hch@infradead.org>,
	"Михаил Гаврилов" <mikhail.v.gavrilov@gmail.com>,
	linux-xfs@vger.kernel.org, linux-mm@kvack.org
Subject: Re: kernel BUG at fs/xfs/xfs_aops.c:853! in kernel 4.13 rc6
Date: Wed, 6 Sep 2017 09:42:56 +1000	[thread overview]
Message-ID: <20170905234256.GP17782@dastard> (raw)
In-Reply-To: <20170905161734.GA25379@quack2.suse.cz>

On Tue, Sep 05, 2017 at 06:17:34PM +0200, Jan Kara wrote:
> On Tue 05-09-17 08:36:48, Dave Chinner wrote:
> > On Mon, Sep 04, 2017 at 02:14:52PM +0200, Jan Kara wrote:
> > > > Seems like a reasonable revert/change, but given that ext3 was killed
> > > > off long ago, is it even still the case that the mm can feed releasepage
> > > > a dirty clean page?  If that is the case, then isn't it time to fix the
> > > > mm too?
> > > 
> > > Yes, ->releasepage() can still get PageDirty page. Whether the page can or
> > > cannot be reclaimed is still upto filesystem to decide.
> > 
> > Yes, and so we have to handle it.  For all I know right now we could
> > be chasing single bit memory error/corruptions....
> 
> Possibly, although I'm not convinced - as I've mentioned I've seen exact
> same assertion failure in XFS on our SLE12-SP2 kernel (4.4 based) in one of
> customers setup. And I've seen two or three times ext4 barfing for exactly
> same reason - buffers stripped from dirty page.

Yeah, we're chasing ghosts at the moment. :/

[....]
> > > Now XFS shouldn't
> > > really end up freeing such page - either because those delalloc / unwritten
> > > checks trigger or because try_to_free_buffers() refuses to free dirty
> > > buffers.
> > 
> > Except if the dirty page has come through the block_invalidation()
> > path, because all the buffers on the page have been invalidated and
> > cleaned. i.e. we've already removed BH_Dirty, BH_Delay and
> > BH_unwritten from all the buffer heads, so invalidated dirty pages
> > will run right through buffers will be removed.
> > 
> > Every caller to ->releasepage() - except the invalidatepage path and
> > the than the bufferhead stripper - checks PageDirty *after* the
> > ->releasepage call and return without doing anything because they
> > aren't supposed to be releasing dirty pages. So if XFS has decided
> > the page can be released, but a mapping invalidation call then notes
> > the page is dirty, it won't invalidate the pagei but it will have
> > had the bufferheads stripped. That's another possible vector, and
> > one that explicit checking of the page dirty flag will avoid.
> 
> Are you speaking about the PageDirty check in __remove_mapping()? I agree
> that checking PageDirty in releasepage would narrow that window for
> corruption although won't close it completely - there are places in the
> kernel that call set_page_dirty() without page lock held and can thus race
> with page invalidation. But I didn't find how any such callsite could race
> to cause what we are observing...

I was referring to invalidate_complete_page2() - I didn't look down
the __remove_mapping() path after I found the first example in
icp2....

> > Hence my question about XFS being able to cancel the page dirty flag
> > before calling block_invalidation() so that we can untangle the mess
> > where we can't tell the difference between a "must release a dirty
> > invalidated page because we've already invalidated the bufferheads"
> > context and the other "release page only if not dirty" caller
> > context?
> 
> Yeah, I agree that if you add cancel_dirty_page() into
> xfs_vm_invalidatepage() before calling block_invalidatepage() and then bail
> on dirty page in xfs_vm_releasepage(), things should work as well and they
> would be more robust.

Ok, I'll put together a patch to do that. Thanks Jan!

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2017-09-05 23:43 UTC|newest]

Thread overview: 65+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-03  4:22 kernel BUG at fs/xfs/xfs_aops.c:853! in kernel 4.13 rc6 Михаил Гаврилов
2017-09-03  7:43 ` Christoph Hellwig
2017-09-03  7:43   ` Christoph Hellwig
2017-09-03 14:08   ` Михаил Гаврилов
2017-09-03 14:08     ` Михаил Гаврилов
2017-09-04 12:30     ` Jan Kara
2017-09-04 12:30       ` Jan Kara
2017-10-07  8:10       ` Михаил Гаврилов
2017-10-07  8:10         ` Михаил Гаврилов
2017-10-07  9:22         ` Михаил Гаврилов
2017-10-07  9:22           ` Михаил Гаврилов
2017-10-09  0:05         ` Dave Chinner
2017-10-09  0:05           ` Dave Chinner
2017-10-09 18:31           ` Luis R. Rodriguez
2017-10-09 18:31             ` Luis R. Rodriguez
2017-10-09 19:02             ` Eric W. Biederman
2017-10-09 19:02               ` Eric W. Biederman
2017-10-15  8:53               ` Aleksa Sarai
2017-10-15  8:53                 ` Aleksa Sarai
2017-10-15 13:06                 ` Theodore Ts'o
2017-10-15 13:06                   ` Theodore Ts'o
2017-10-15 22:14                   ` Eric W. Biederman
2017-10-15 22:14                     ` Eric W. Biederman
2017-10-15 23:22                     ` Dave Chinner
2017-10-15 23:22                       ` Dave Chinner
2017-10-16 17:44                       ` Eric W. Biederman
2017-10-16 17:44                         ` Eric W. Biederman
2017-10-16 21:38                         ` Dave Chinner
2017-10-16 21:38                           ` Dave Chinner
2017-10-16  1:13                     ` Theodore Ts'o
2017-10-16  1:13                       ` Theodore Ts'o
2017-10-16 17:53                       ` Eric W. Biederman
2017-10-16 17:53                         ` Eric W. Biederman
2017-10-16 18:50                         ` Theodore Ts'o
2017-10-16 18:50                           ` Theodore Ts'o
2017-10-16 22:00                       ` Dave Chinner
2017-10-16 22:00                         ` Dave Chinner
2017-10-17  1:34                         ` Theodore Ts'o
2017-10-17  1:34                           ` Theodore Ts'o
2017-10-17  0:59                       ` Aleksa Sarai
2017-10-17  0:59                         ` Aleksa Sarai
2017-10-17  9:20                         ` Jan Kara
2017-10-17  9:20                           ` Jan Kara
2017-10-17 14:12                           ` Theodore Ts'o
2017-10-17 14:12                             ` Theodore Ts'o
2017-11-06 19:25                             ` Luis R. Rodriguez
2017-11-06 19:25                               ` Luis R. Rodriguez
2017-11-07 15:26                               ` Jan Kara
2017-11-07 15:26                                 ` Jan Kara
2017-10-09 22:28             ` Dave Chinner
2017-10-09 22:28               ` Dave Chinner
2017-10-10  7:57               ` Jan Kara
2017-10-10  7:57                 ` Jan Kara
2017-09-04  1:43   ` Dave Chinner
2017-09-04  1:43     ` Dave Chinner
2017-09-04  2:20     ` Darrick J. Wong
2017-09-04  2:20       ` Darrick J. Wong
2017-09-04 12:14       ` Jan Kara
2017-09-04 12:14         ` Jan Kara
2017-09-04 22:36         ` Dave Chinner
2017-09-04 22:36           ` Dave Chinner
2017-09-05 16:17           ` Jan Kara
2017-09-05 16:17             ` Jan Kara
2017-09-05 23:42             ` Dave Chinner [this message]
2017-09-05 23:42               ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170905234256.GP17782@dastard \
    --to=david@fromorbit.com \
    --cc=darrick.wong@oracle.com \
    --cc=hch@infradead.org \
    --cc=jack@suse.cz \
    --cc=linux-mm@kvack.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=mikhail.v.gavrilov@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.