From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: with ECARTIS (v1.0.0; list xfs); Tue, 04 Dec 2007 23:04:09 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lB573rkl027788 for ; Tue, 4 Dec 2007 23:03:56 -0800 Message-ID: <47564D81.2040204@sgi.com> Date: Wed, 05 Dec 2007 18:04:33 +1100 From: Lachlan McIlroy Reply-To: lachlan@sgi.com MIME-Version: 1.0 Subject: Re: 2.6.24-rc3 oopses while mounting fs References: <20071128134523.GF7793@luba> <474E003A.7020000@sgi.com> <20071130223154.GB13589@luba> <47537709.9040406@sgi.com> <20071203220200.GC13667@luba> <47549B1A.4070508@sgi.com> <4754FCEC.30906@sgi.com> <47563B73.2030403@sgi.com> <47564761.9070805@sgi.com> In-Reply-To: <47564761.9070805@sgi.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com List-Id: xfs To: Timothy Shimmin Cc: xfs@oss.sgi.com Thanks Tim. Timothy Shimmin wrote: > Lachlan, > > An aside... > > I think I see why we couldn't get "xfs_logprint -t" to work on a file (-f). > > xfs_logprint -t -f ~/debug/sdc.xlg > xfs_logprint: > data device: 0xffffffffffffffff > log device: 0xffffffffffffffff daddr: 0 length: 262144 > > XFS: Log inconsistent (didn't find previous header) > XFS: failed to find log head > xfs_logprint: failed to find head and tail, error: 5 > > The problem is that it is using the superblock to determine if it is > version 1 > or version 2 logs in order to find out the maximum size of the iclog > (log record). > And for -f there is no superblock to look at. The version number is, > however, in the > log record header. > So when it looks in the undefined data it comes up with _not_ version 2, > and so assumes to look back for the hdr to only 32K when it really needs > to look back to 64k (for v2 it would limit at 256K). > Hence, it can't find the magic# in the header. > > The debugging... > > ==================================================== > Breakpoint 1, xlog_find_verify_log_record (log=0x607ffffffea96be0, > start_blk=5312, last_blk=0x607ffffffea96bc0, extra_bblks=0) > at xfs_log_recover.c:234 > 234 if (!(bp = xlog_get_bp(log, num_blks))) { > (gdb) bt > #0 xlog_find_verify_log_record (log=0x607ffffffea96be0, start_blk=5312, > last_blk=0x607ffffffea96bc0, extra_bblks=0) at xfs_log_recover.c:234 > #1 0x4000000000092020 in xlog_find_head (log=0x607ffffffea96be0, > return_head_blk=0x607ffffffea96bd0) at xfs_log_recover.c:527 > #2 0x4000000000094000 in xlog_find_tail (log=0x607ffffffea96be0, > head_blk=0x607ffffffea96bd0, tail_blk=0x607ffffffea96bd8, readonly=0) > at xfs_log_recover.c:616 > #3 0x40000000000100a0 in xfs_log_print_trans (log=0x607ffffffea96be0, > print_block_start=-1) at log_print_trans.c:49 > #4 0x4000000000003600 in main (argc=, argv= optimized out>) at logprint.c:240 > > (gdb) print *last_blk > $1 = 5376 > > start_blk = 5312 > last_blk = 5376 > extra_bblks = 0 > > Goes from 5376..5312 looking for magic# at start of sector. > Errors out if can't find one. > Which is true since the next magic# is at 5248. > > logprint -d output: > 4992 HEADER Cycle 154 tail 153:253056 len 61440 ops 105 > 5120 HEADER Cycle 154 tail 153:253056 len 61440 ops 344 > 5248 HEADER Cycle 154 tail 153:253056 len 61440 ops 303 <--- > previous block > 5376 HEADER Cycle 153 tail 153:253056 len 0 ops 0 <--- hdr > [00000 - 05376] Cycle 0x0000009a New Cycle 0x00000099 > 5377 HEADER Cycle 153 tail 153:253056 len 0 ops 0 > 5378 HEADER Cycle 153 tail 153:253056 len 0 ops 0 > > So why is it using a start_blk of 5312? > > > 522 num_scan_bblks = XLOG_REC_SHIFT(log); > 523 if (head_blk >= num_scan_bblks) { > 524 start_blk = head_blk - num_scan_bblks; /* don't > read head_blk */ > 525 > 526 /* start ptr at last block ptr before head_blk */ > ->527 if ((error = xlog_find_verify_log_record(log, > start_blk, > 528 > &head_blk, 0)) == -1) { > > #define XLOG_REC_SHIFT(log) \ > BTOBB(1 << (XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb) ? \ > XLOG_MAX_RECORD_BSHIFT : XLOG_BIG_RECORD_BSHIFT)) > > #define XLOG_BIG_RECORD_BSHIFT 15 /* 32k == 1 << 15 */ > #define XLOG_MAX_RECORD_BSHIFT 18 /* 256k == 1 << 18 */ > > Number of blocks between rec-hdrs from -d output: > 5376-5248 = 128 BB > > Looks like it used 5312 = 5376 - num_scan_bblks > => num_scan_bblks = 64 > > Yep > (gdb) p num_scan_bblks > $3 = 64 > 32K > i.e. it thinks it has v1 logs > > (gdb) p *(log->l_mp) > $9 = {m_sb = {sb_magicnum = 0, sb_blocksize = 0, sb_dblocks = 0, > sb_rblocks = 0, sb_rextents = 0, sb_uuid = '\0' , > sb_logstart = 2305843009213997776, sb_rootino = 6953557824646229696, > sb_rbmino = 6953557824646229688, sb_rsumino = 6953557824646229680, > sb_rextsize = 284472, sb_agblocks = 536870912, sb_agcount = > 4281151176, sb_rbmblocks = 1619001343, sb_logblocks = 0, sb_versionnum = 0, > sb_sectsize = 0, sb_inodesize = 38256, sb_inopblock = 4, sb_fname = > "\000\000\000 H?\a\000\000\000\000 ", sb_blocklog = 0 '\0', > etc... > it is not set > > Okay, need to file a bug for this part so that -f can be useful here. > It needs to be using a LR h_version number for this somehow. > > --Tim > > > Lachlan McIlroy wrote: >>> ============================================================================ >>> >>> cycle: 154 version: 2 lsn: 154,5248 tail_lsn: 153,253056 >>> length of Log Record: 61440 prev offset: 5120 num ops: 303 >>> uuid: b02fc6bd-662f-40a5-8f0c-e260a881a3e7 format: little endian linux >>> h_size: 262144 >>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>> >>> extended-header: cycle: 154 >>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>> >>> extended-header: cycle: 154 >>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>> >>> extended-header: cycle: 154 >>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>> >>> extended-header: cycle: 154 >>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>> >>> extended-header: cycle: 154 >>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>> >>> extended-header: cycle: 154 >>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>> >>> extended-header: cycle: 154 >>> ---------------------------------------------------------------------------- >>> >>> Oper (0): tid: 5769cd60 len: 0 clientid: TRANS flags: START >>> ---------------------------------------------------------------------------- >>> >>> Oper (1): tid: 5769cd60 len: 16 clientid: TRANS flags: none >>> TRAN: type: STRAT_WRITE tid: 0 num_items: 5 >>> ---------------------------------------------------------------------------- >>> >>> Oper (2): tid: 5769cd60 len: 56 clientid: TRANS flags: none >>> INODE: #regs: 3 ino: 0x1400005c flags: 0x5 dsize: 16 >>> blkno: 610338736 len: 16 boff: 4096 >>> Oper (3): tid: 5769cd60 len: 96 clientid: TRANS flags: none >>> INODE CORE >>> magic 0x494e mode 0100600 version 1 format 2 >>> nlink 1 uid 14029 gid 1474 >>> atime 0x4716b870 mtime 0x4716bade ctime 0x4716bade >>> size 0x1ce00000 nblocks 0x1c840 extsize 0x0 nextents 0x1 >>> naextents 0x0 forkoff 0 dmevmask 0x0 dmstate 0x0 >>> flags 0x0 gen 0x4 >>> Oper (4): tid: 5769cd60 len: 16 clientid: TRANS flags: none >>> EXTENTS inode data >>> ---------------------------------------------------------------------------- >> >> >> >> Another interesting point is that the transaction id has changed between >> these operations from 5769cd60 to 5769cf18. It should stay the same. >> We may be reading old log data. >> >>> >>> Oper (5): tid: 5769cf18 len: 24 clientid: TRANS flags: none >>> BUF: #regs: 2 start blkno: 610338561 (0x24610701) len: 1 bmap >>> size: 1 flags: 0x0 >>> Oper (6): tid: 5769cf18 len: 128 clientid: TRANS flags: none >>> AGF Buffer: XAGF >>> ver: 1 seq#: 5 len: 15258464 >>> root BNO: 1 CNT: 2 >>> level BNO: 1 CNT: 1 >>> 1st: 0 last: 3 cnt: 4 freeblks: 2491090 longest: 581471 >>> ---------------------------------------------------------------------------- >>> >>> Oper (7): tid: 5769cf18 len: 28 clientid: TRANS flags: none >>> BUF: #regs: 2 start blkno: 610338568 (0x24610708) len: 8 bmap >>> size: 2 flags: 0x0 >>> Oper (8): tid: 5769cf18 len: 128 clientid: TRANS flags: none >>> BUF DATA >>> ---------------------------------------------------------------------------- >>> > > >