public inbox for linux-xfs@vger.kernel.org
From: Dave Chinner <david@fromorbit.com>
To: Alex Lyakas <alex@zadarastorage.com>
Cc: Christoph Hellwig <hch@infradead.org>,
	Danny Shavit <danny@zadarastorage.com>,
	Shyam Kaushik <shyam@zadarastorage.com>,
	Yair Hershko <yair@zadarastorage.com>,
	xfs@oss.sgi.com
Subject: Re: xfs resize: primary superblock is not updated immediately
Date: Tue, 1 Mar 2016 08:16:28 +1100
Message-ID: <20160229211628.GK29057@dastard>
In-Reply-To: <BC0CC25E00CE4CEDA1FFDFC0A2F38742@alyakaslap>

On Mon, Feb 29, 2016 at 11:47:54AM +0200, Alex Lyakas wrote:
> Hello Dave,
> I have tried the same scenario with the 4.5 kernel from about a week
> ago, latest commit being [1].
> The same crash is happening, stack trace being [2].
> 
> I am not proficient with xfstests, unfortunately. I tried running
> them several times, but I am not sure I was doing that properly.
> 
> As for your question why the "block beyond end of the filesystem
> fails". I tried to debug it further and added a print into
> _xfs_buf_find. What happens is that at some point, the sb_dblocks
> value is updated to the new value, but the in-memory pag object is
> not created. So the test:
> 
> eofs = XFS_FSB_TO_BB(btp->bt_mount, btp->bt_mount->m_sb.sb_dblocks);
> if (blkno < 0 || blkno >= eofs) {
> ...
> 
> still holds, but the needed pag does not exist.
> 
> Here are the results of the prints that I added:
> Feb 29 11:47:09 vc-00-00-350-dev kernel: [   91.546542] SGI XFS with
> ACLs, security attributes, realtime, no debug enabled
> Feb 29 11:47:09 vc-00-00-350-dev kernel: [   91.551243] XFS (dm-0):
> Mounting V4 Filesystem
> Feb 29 11:47:09 vc-00-00-350-dev kernel: [   91.576677] XFS (dm-0):
> Starting recovery (logdev: internal)
> Feb 29 11:47:09 vc-00-00-350-dev kernel: [   91.577866]
> _xfs_buf_find: blkno=0 eofs=200704 >m_sb.sb_dblocks=25088
> Feb 29 11:47:09 vc-00-00-350-dev kernel: [   91.577872]
> _xfs_buf_find: blkno=0 eofs=200704 >m_sb.sb_dblocks=25088
> Feb 29 11:47:09 vc-00-00-350-dev kernel: [   91.577882]
> _xfs_buf_find: blkno=0 eofs=200704 >m_sb.sb_dblocks=25088
> ===> Here we start seeing the new value of "sb_dblocks" and hence
> the new value of "eofs":

We shouldn't see sb_dblocks change until the first phase of log
recovery completes and the in-core superblock is re-read from disk
in xlog_do_recover().
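
For reference, the eofs values in the trace follow from sb_dblocks by a
fixed shift. The sketch below is a userspace stand-in for the quoted
range test, not the kernel code: the names fsb_to_bb() and
beyond_eofs() are made up for illustration, and the 4096-byte block
size is an assumption inferred from the 8:1 ratio between eofs and
sb_dblocks in the log.

```c
#include <stdint.h>

/* Assumed geometry: 4096-byte filesystem blocks, 512-byte basic blocks. */
#define BLOCK_LOG 12	/* log2(4096), inferred from the trace */
#define BB_SHIFT   9	/* log2(512) */

/* Hypothetical stand-in for XFS_FSB_TO_BB(): fs blocks -> basic blocks. */
int64_t fsb_to_bb(uint64_t fsblocks)
{
	return (int64_t)(fsblocks << (BLOCK_LOG - BB_SHIFT));
}

/* Same shape as the quoted test: nonzero if blkno lies outside the fs. */
int beyond_eofs(int64_t blkno, uint64_t sb_dblocks)
{
	int64_t eofs = fsb_to_bb(sb_dblocks);

	return blkno < 0 || blkno >= eofs;
}
```

With these assumptions, 25088 blocks gives eofs=200704 and 25600 blocks
gives eofs=204800, matching the trace. A lookup at blkno=200705 is
rejected against the old size and accepted against the new one, so the
range check alone says nothing about whether per-AG state exists for
that block.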

> Feb 29 11:47:09 vc-00-00-350-dev kernel: [   91.606796]
> _xfs_buf_find: blkno=1 eofs=204800 >m_sb.sb_dblocks=25600
> Feb 29 11:47:09 vc-00-00-350-dev kernel: [   91.606804]
> _xfs_buf_find: blkno=1 eofs=204800 >m_sb.sb_dblocks=25600

looking up AGF 0.

> Feb 29 11:47:09 vc-00-00-350-dev kernel: [   91.606988]
> _xfs_buf_find: blkno=2 eofs=204800 >m_sb.sb_dblocks=25600
> Feb 29 11:47:09 vc-00-00-350-dev kernel: [   91.606991]
> _xfs_buf_find: blkno=2 eofs=204800 >m_sb.sb_dblocks=25600

Now AGI 0.

> Feb 29 11:47:09 vc-00-00-350-dev kernel: [   91.607205]
> _xfs_buf_find: blkno=50177 eofs=204800 >m_sb.sb_dblocks=25600
> Feb 29 11:47:09 vc-00-00-350-dev kernel: [   91.607210]
> _xfs_buf_find: blkno=50177 eofs=204800 >m_sb.sb_dblocks=25600
> Feb 29 11:47:09 vc-00-00-350-dev kernel: [   91.607375]
> _xfs_buf_find: blkno=50178 eofs=204800 >m_sb.sb_dblocks=25600
> Feb 29 11:47:09 vc-00-00-350-dev kernel: [   91.607378]
> _xfs_buf_find: blkno=50178 eofs=204800 >m_sb.sb_dblocks=25600

AGF/AGI 1.

> Feb 29 11:47:09 vc-00-00-350-dev kernel: [   91.607512]
> _xfs_buf_find: blkno=100353 eofs=204800 >m_sb.sb_dblocks=25600
> Feb 29 11:47:09 vc-00-00-350-dev kernel: [   91.607515]
> _xfs_buf_find: blkno=100353 eofs=204800 >m_sb.sb_dblocks=25600
> Feb 29 11:47:09 vc-00-00-350-dev kernel: [   91.607652]
> _xfs_buf_find: blkno=100354 eofs=204800 >m_sb.sb_dblocks=25600
> Feb 29 11:47:09 vc-00-00-350-dev kernel: [   91.607655]
> _xfs_buf_find: blkno=100354 eofs=204800 >m_sb.sb_dblocks=25600

AGF/AGI 2.

> Feb 29 11:47:09 vc-00-00-350-dev kernel: [   91.607829]
> _xfs_buf_find: blkno=150529 eofs=204800 >m_sb.sb_dblocks=25600
> Feb 29 11:47:09 vc-00-00-350-dev kernel: [   91.607832]
> _xfs_buf_find: blkno=150529 eofs=204800 >m_sb.sb_dblocks=25600
> Feb 29 11:47:09 vc-00-00-350-dev kernel: [   91.607980]
> _xfs_buf_find: blkno=150530 eofs=204800 >m_sb.sb_dblocks=25600
> Feb 29 11:47:09 vc-00-00-350-dev kernel: [   91.607982]
> _xfs_buf_find: blkno=150530 eofs=204800 >m_sb.sb_dblocks=25600

AGF/AGI 3.

> Feb 29 11:47:09 vc-00-00-350-dev kernel: [   91.608073]
> _xfs_buf_find: blkno=200705 eofs=204800 >m_sb.sb_dblocks=25600
> ===> and here we crash, but as you see, blkno is valid WRT eofs value.
> Feb 29 11:47:09 vc-00-00-350-dev kernel: [   91.608120] BUG: unable
> to handle kernel NULL pointer dereference at 0000000000000098

AGF 4 goes splat.

Which means it's through the first phase of log recovery and it's
not failing in log recovery, i.e. we are now running
xfs_initialize_perag_data() after log recovery. So, as I said a
couple of posts back up this thread:

| If log recovery succeeds, then yes, I can see that there is a
| problem here because the per-ag tree is not reinitialised after
| the superblock is re-read. That's a pretty easy fix, though (3-4
| lines of code in xlog_do_recover() to detect a change in
| filesystem block count and call xfs_initialize_perag() again).
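
To make the failure and the shape of that fix concrete, here is a
userspace model. Everything in it is hypothetical: the model_* types
and functions are invented for illustration and are not the kernel's
structures or the real xfs_initialize_perag() signature. The geometry
(6272 blocks per AG, 8 basic blocks per filesystem block) is inferred
from the trace, and rounding the AG count up for a short last AG is an
assumption.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-AG object and mount; not the kernel structures. */
struct model_perag { uint32_t agno; };

struct model_mount {
	uint64_t dblocks;	/* filesystem size in blocks (sb_dblocks) */
	uint32_t agblocks;	/* blocks per allocation group */
	uint32_t agcount;	/* AG count implied by dblocks */
	uint32_t perag_count;	/* per-AG objects actually set up */
	struct model_perag **perag;
};

/* AG count for a block count, rounding up for a short last AG. */
uint32_t model_agcount(uint64_t dblocks, uint32_t agblocks)
{
	return (uint32_t)((dblocks + agblocks - 1) / agblocks);
}

/* AG index of a basic-block address, assuming 8 basic blocks per block. */
uint32_t model_daddr_to_agno(const struct model_mount *mp, int64_t blkno)
{
	return (uint32_t)((uint64_t)(blkno / 8) / mp->agblocks);
}

/* Create per-AG objects for every AG index that does not have one yet. */
int model_initialize_perag(struct model_mount *mp, uint32_t agcount)
{
	struct model_perag **p;
	uint32_t i;

	if (agcount <= mp->perag_count)
		return 0;
	p = realloc(mp->perag, agcount * sizeof(*p));
	if (!p)
		return -1;
	mp->perag = p;
	for (i = mp->perag_count; i < agcount; i++) {
		p[i] = malloc(sizeof(*p[i]));
		if (!p[i])
			return -1;
		p[i]->agno = i;
		mp->perag_count = i + 1;
	}
	return 0;
}

/* Lookup returns NULL when no per-AG object exists for agno. */
struct model_perag *model_perag_get(struct model_mount *mp, uint32_t agno)
{
	return agno < mp->perag_count ? mp->perag[agno] : NULL;
}

/*
 * Model of re-reading the superblock after replay. With apply_fix set,
 * a change in block count also rebuilds the per-AG objects.
 */
int model_reread_sb(struct model_mount *mp, uint64_t new_dblocks,
		    int apply_fix)
{
	uint64_t old_dblocks = mp->dblocks;

	mp->dblocks = new_dblocks;
	mp->agcount = model_agcount(new_dblocks, mp->agblocks);
	if (apply_fix && new_dblocks != old_dblocks)
		return model_initialize_perag(mp, mp->agcount);
	return 0;
}
```

In this model, a mount set up for 25088 blocks has four per-AG objects.
Re-reading a superblock of 25600 blocks without the fix leaves
model_perag_get() returning NULL for the AG that holds blkno=200705
(the fifth AG, index 4 counting from zero), which is the dereference
that blows up in the trace. With the fix enabled, the same lookup
finds an object.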

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

