Re: 2.6.24-rc3 oopses while mounting fs

public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed

From: Lachlan McIlroy <lachlan@sgi.com>
To: Timothy Shimmin <tes@sgi.com>
Cc: xfs@oss.sgi.com
Subject: Re: 2.6.24-rc3 oopses while mounting fs
Date: Wed, 05 Dec 2007 18:04:33 +1100	[thread overview]
Message-ID: <47564D81.2040204@sgi.com> (raw)
In-Reply-To: <47564761.9070805@sgi.com>

Thanks Tim.

Timothy Shimmin wrote:
> Lachlan,
> 
> An aside...
> 
> I think I see why we couldn't get "xfs_logprint -t" to work on a file (-f).
>   > xfs_logprint -t -f ~/debug/sdc.xlg
>   xfs_logprint:
>       data device: 0xffffffffffffffff
>       log device: 0xffffffffffffffff daddr: 0 length: 262144
> 
>   XFS: Log inconsistent (didn't find previous header)
>   XFS: failed to find log head
>   xfs_logprint: failed to find head and tail, error: 5
> 
> The problem is that it is using the superblock to determine if it is 
> version 1
> or version 2 logs in order to find out the maximum size of the iclog 
> (log record).
> And for -f there is no superblock to look at. The version number is, 
> however, in the
> log record header.
> So when it looks in the undefined data it comes up with _not_ version 2,
> and so assumes to look back for the hdr to only 32K when it really needs
> to look back to 64k (for v2 it would limit at 256K).
> Hence, it can't find the magic# in the header.
> 
> The debugging...
> 
> ====================================================
> Breakpoint 1, xlog_find_verify_log_record (log=0x607ffffffea96be0, 
> start_blk=5312, last_blk=0x607ffffffea96bc0, extra_bblks=0)
>     at xfs_log_recover.c:234
> 234             if (!(bp = xlog_get_bp(log, num_blks))) {
> (gdb) bt
> #0  xlog_find_verify_log_record (log=0x607ffffffea96be0, start_blk=5312, 
> last_blk=0x607ffffffea96bc0, extra_bblks=0) at xfs_log_recover.c:234
> #1  0x4000000000092020 in xlog_find_head (log=0x607ffffffea96be0, 
> return_head_blk=0x607ffffffea96bd0) at xfs_log_recover.c:527
> #2  0x4000000000094000 in xlog_find_tail (log=0x607ffffffea96be0, 
> head_blk=0x607ffffffea96bd0, tail_blk=0x607ffffffea96bd8, readonly=0)
>     at xfs_log_recover.c:616
> #3  0x40000000000100a0 in xfs_log_print_trans (log=0x607ffffffea96be0, 
> print_block_start=-1) at log_print_trans.c:49
> #4  0x4000000000003600 in main (argc=<value optimized out>, argv=<value 
> optimized out>) at logprint.c:240
> 
> (gdb) print *last_blk
> $1 = 5376
> 
> start_blk = 5312
> last_blk = 5376
> extra_bblks = 0
> 
> Goes from 5376..5312 looking for magic# at start of sector.
> Errors out if can't find one.
> Which is true since the next magic# is at 5248.
> 
> logprint -d output:
>   4992 HEADER Cycle 154 tail 153:253056 len  61440 ops 105
>   5120 HEADER Cycle 154 tail 153:253056 len  61440 ops 344
>   5248 HEADER Cycle 154 tail 153:253056 len  61440 ops 303  <--- 
> previous block
>   5376 HEADER Cycle 153 tail 153:253056 len      0 ops 0    <--- hdr
> [00000 - 05376] Cycle 0x0000009a New Cycle 0x00000099
>   5377 HEADER Cycle 153 tail 153:253056 len      0 ops 0
>   5378 HEADER Cycle 153 tail 153:253056 len      0 ops 0
> 
> So why is it using a start_blk of 5312?
> 
> 
>     522         num_scan_bblks = XLOG_REC_SHIFT(log);
>     523         if (head_blk >= num_scan_bblks) {
>     524                 start_blk = head_blk - num_scan_bblks; /* don't 
> read head_blk */
>     525
>     526                 /* start ptr at last block ptr before head_blk */
>  ->527                 if ((error = xlog_find_verify_log_record(log, 
> start_blk,
>     528                                                         
> &head_blk, 0)) == -1) {
> 
> #define XLOG_REC_SHIFT(log) \
>         BTOBB(1 << (XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb) ? \
>          XLOG_MAX_RECORD_BSHIFT : XLOG_BIG_RECORD_BSHIFT))
> 
> #define XLOG_BIG_RECORD_BSHIFT  15              /* 32k == 1 << 15 */
> #define XLOG_MAX_RECORD_BSHIFT  18              /* 256k == 1 << 18 */
> 
> Number of blocks between rec-hdrs from -d output:
> 5376-5248 = 128 BB
> 
> Looks like it used 5312 = 5376 - num_scan_bblks
> => num_scan_bblks = 64
> 
> Yep
> (gdb) p num_scan_bblks
> $3 = 64
> 32K
> i.e. it thinks it has v1 logs
> 
> (gdb) p *(log->l_mp)
> $9 = {m_sb = {sb_magicnum = 0, sb_blocksize = 0, sb_dblocks = 0, 
> sb_rblocks = 0, sb_rextents = 0, sb_uuid = '\0' <repeats 15 times>,
>     sb_logstart = 2305843009213997776, sb_rootino = 6953557824646229696, 
> sb_rbmino = 6953557824646229688, sb_rsumino = 6953557824646229680,
>     sb_rextsize = 284472, sb_agblocks = 536870912, sb_agcount = 
> 4281151176, sb_rbmblocks = 1619001343, sb_logblocks = 0, sb_versionnum = 0,
>     sb_sectsize = 0, sb_inodesize = 38256, sb_inopblock = 4, sb_fname = 
> "\000\000\000 H?\a\000\000\000\000 ", sb_blocklog = 0 '\0',
> etc...
> it is not set
> 
> Okay, need to file a bug for this part so that -f can be useful here.
> It needs to be using a LR h_version number for this somehow.
> 
> --Tim
> 
> 
> Lachlan McIlroy wrote:
>>> ============================================================================ 
>>>
>>> cycle: 154    version: 2        lsn: 154,5248    tail_lsn: 153,253056
>>> length of Log Record: 61440    prev offset: 5120        num ops: 303
>>> uuid: b02fc6bd-662f-40a5-8f0c-e260a881a3e7   format: little endian linux
>>> h_size: 262144
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
>>>
>>> extended-header: cycle: 154
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
>>>
>>> extended-header: cycle: 154
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
>>>
>>> extended-header: cycle: 154
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
>>>
>>> extended-header: cycle: 154
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
>>>
>>> extended-header: cycle: 154
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
>>>
>>> extended-header: cycle: 154
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
>>>
>>> extended-header: cycle: 154
>>> ---------------------------------------------------------------------------- 
>>>
>>> Oper (0): tid: 5769cd60  len: 0  clientid: TRANS  flags: START
>>> ---------------------------------------------------------------------------- 
>>>
>>> Oper (1): tid: 5769cd60  len: 16  clientid: TRANS  flags: none
>>> TRAN:    type: STRAT_WRITE       tid: 0       num_items: 5
>>> ---------------------------------------------------------------------------- 
>>>
>>> Oper (2): tid: 5769cd60  len: 56  clientid: TRANS  flags: none
>>> INODE: #regs: 3   ino: 0x1400005c  flags: 0x5   dsize: 16
>>>         blkno: 610338736  len: 16  boff: 4096
>>> Oper (3): tid: 5769cd60  len: 96  clientid: TRANS  flags: none
>>> INODE CORE
>>> magic 0x494e mode 0100600 version 1 format 2
>>> nlink 1 uid 14029 gid 1474
>>> atime 0x4716b870 mtime 0x4716bade ctime 0x4716bade
>>> size 0x1ce00000 nblocks 0x1c840 extsize 0x0 nextents 0x1
>>> naextents 0x0 forkoff 0 dmevmask 0x0 dmstate 0x0
>>> flags 0x0 gen 0x4
>>> Oper (4): tid: 5769cd60  len: 16  clientid: TRANS  flags: none
>>> EXTENTS inode data
>>> ---------------------------------------------------------------------------- 
>>
>>
>>
>> Another interesting point is that the transaction id has changed between
>> these operations from 5769cd60 to 5769cf18.  It should stay the same.
>> We may be reading old log data.
>>
>>>
>>> Oper (5): tid: 5769cf18  len: 24  clientid: TRANS  flags: none
>>> BUF:  #regs: 2   start blkno: 610338561 (0x24610701)  len: 1  bmap 
>>> size: 1  flags: 0x0
>>> Oper (6): tid: 5769cf18  len: 128  clientid: TRANS  flags: none
>>> AGF Buffer: XAGF
>>> ver: 1  seq#: 5  len: 15258464
>>> root BNO: 1  CNT: 2
>>> level BNO: 1  CNT: 1
>>> 1st: 0  last: 3  cnt: 4  freeblks: 2491090  longest: 581471
>>> ---------------------------------------------------------------------------- 
>>>
>>> Oper (7): tid: 5769cf18  len: 28  clientid: TRANS  flags: none
>>> BUF:  #regs: 2   start blkno: 610338568 (0x24610708)  len: 8  bmap 
>>> size: 2  flags: 0x0
>>> Oper (8): tid: 5769cf18  len: 128  clientid: TRANS  flags: none
>>> BUF DATA
>>> ---------------------------------------------------------------------------- 
>>>
> 
> 
>

next prev parent reply	other threads:[~2007-12-05  7:04 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-11-28 13:45 2.6.24-rc3 oopses while mounting fs KELEMEN Peter
2007-11-28 23:56 ` Lachlan McIlroy
2007-11-30 22:31   ` KELEMEN Peter
2007-12-03  3:24     ` Lachlan McIlroy
2007-12-03 22:02       ` KELEMEN Peter
2007-12-04  0:11         ` Timothy Shimmin
2007-12-04  7:08           ` Lachlan McIlroy
2007-12-04  9:57             ` KELEMEN Peter
2007-12-05  5:47             ` Lachlan McIlroy
2007-12-05  6:38               ` Timothy Shimmin
2007-12-05  7:04                 ` Lachlan McIlroy [this message]
2007-12-04  9:42           ` KELEMEN Peter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=47564D81.2040204@sgi.com \
    --to=lachlan@sgi.com \
    --cc=tes@sgi.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox