linux-raid.vger.kernel.org archive mirror
* RAID1 submirror failure causes reboot?
@ 2006-11-10  8:17 Jim Klimov
  2006-11-10  8:41 ` Neil Brown
  0 siblings, 1 reply; 8+ messages in thread
From: Jim Klimov @ 2006-11-10  8:17 UTC (permalink / raw)
  To: Linux RAID

Hello Linux RAID,

  One of our servers using per-partition mirroring has a
  frequently-failing partition, hdc11 below.

  When it is deemed failed, the server usually crashes
  with a stack trace like the one below. This seems strange,
  because the other submirror, hda11, is alive and well, and
  the failure should be handled transparently by the RAID
  layer; isn't that what it is for?

  After the reboot I usually succeed in hot-adding hdc11
  back into the mirror, although several times it was not
  marked dead at all and the array rebuilt by itself after
  the reboot. That also seems rather incorrect: if the disk
  died, it should be marked as such (perhaps in the metadata
  on the live mirror)?

  Overall, uncool (although mirroring has saved us many
  times, thanks!)

Nov 10 03:56:51 video kernel: [84443.270516] md: syncing RAID array md11
Nov 10 03:56:52 video kernel: [84443.270532] md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
Nov 10 03:56:54 video kernel: [84443.270544] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reconstruction.
Nov 10 03:56:55 video kernel: [84443.270565] md: using 128k window, over a total of 65430144 blocks.
Nov 10 03:56:56 video kernel: [84443.271478] RAID1 conf printout:
Nov 10 03:57:01 video kernel: [84443.275446]  --- wd:2 rd:2
Nov 10 03:57:10 video kernel: [84443.278773]  disk 0, wo:0, o:1, dev:hdc10
Nov 10 03:57:11 video kernel: [84443.283272]  disk 1, wo:0, o:1, dev:hda10
[87319.049902] hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
[87319.057393] hdc: dma_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315631
[87319.067205] ide: failed opcode was: unknown
[87323.956399] hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
[87323.963681] hdc: dma_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315631
[87323.973171] ide: failed opcode was: unknown
[87328.846265] hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
[87328.853485] hdc: dma_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315631
[87328.862834] ide: failed opcode was: unknown
[87333.736127] hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
[87333.743535] hdc: dma_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315631
[87333.752876] ide: failed opcode was: unknown
[87333.806569] ide1: reset: success
[87338.675891] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87338.685143] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315711
[87338.694791] ide: failed opcode was: unknown
[87343.557424] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87343.566388] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315711
[87343.576105] ide: failed opcode was: unknown
[87348.472226] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87348.481170] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315711
[87348.490843] ide: failed opcode was: unknown
[87353.387028] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87353.395735] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315711
[87353.405500] ide: failed opcode was: unknown
[87353.461342] ide1: reset: success
[87358.326783] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87358.335739] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315718
[87358.345395] ide: failed opcode was: unknown
[87363.208313] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87363.217319] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315718
[87363.228371] ide: failed opcode was: unknown
[87368.106472] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87368.115414] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315718
[87368.125275] ide: failed opcode was: unknown
[87372.979686] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87372.988706] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315718
[87372.998849] ide: failed opcode was: unknown
[87373.052152] ide1: reset: success
[87377.927744] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87377.936682] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315718
[87377.946399] ide: failed opcode was: unknown
[87382.800953] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87382.809881] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315718
[87382.819511] ide: failed opcode was: unknown
[87387.682479] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87387.691473] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315718
[87387.701287] ide: failed opcode was: unknown
[87392.564004] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87392.572790] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315718
[87392.582454] ide: failed opcode was: unknown
[87392.635961] ide1: reset: success
[87397.528687] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87397.537607] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315718
[87397.547335] ide: failed opcode was: unknown
[87397.551897] end_request: I/O error, dev hdc, sector 176315718
[87398.520820] raid1: Disk failure on hdc11, disabling device. 
[87398.520826]  Operation continuing on 1 devices
[87398.531579] blk: request botched
[87398.535098] hdc: task_out_intr: status=0x50 { DriveReady SeekComplete }
[87398.542129] ide: failed opcode was: unknown
[87403.582775] ------------[ cut here ]------------
[87403.587748] kernel BUG at mm/filemap.c:541!
[87403.592082] invalid opcode: 0000 [#1]
[87403.596063] SMP 
[87403.598217] Modules linked in: w83781d hwmon_vid i2c_isa i2c_core w83627hf_wdt
[87403.606114] CPU:    0
[87403.606117] EIP:    0060:[<c01406a7>]    Not tainted VLI
[87403.606120] EFLAGS: 00010046   (2.6.18.2debug #1) 
[87403.619728] EIP is at unlock_page+0x12/0x2d
[87403.624170] eax: 00000000   ebx: c2d5caa8   ecx: e8148680   edx: c2d5caa8
[87403.631543] esi: da71c600   edi: 00000001   ebp: c04cfe28   esp: c04cfe24
[87403.638924] ds: 007b   es: 007b   ss: 0068
[87403.643419] Process swapper (pid: 0, ti=c04ce000 task=c041e500 task.ti=c04ce000)
[87403.650774] Stack: e81487e8 c04cfe3c c0180e0a da71c600 00000000 c0180dac c04cfe64 c0164af9 
[87403.659985]        f7d49000 c04cfe84 f2dea5a0 f2dea5a0 00000000 da71c600 00000000 da71c600 
[87403.669288]        c04cfea8 c0256778 c041e500 00000000 c04cbd90 00000046 00000000 00000000 
[87403.678603] Call Trace:
[87403.681462]  [<c0103bba>] show_stack_log_lvl+0x8d/0xaa
[87403.686911]  [<c0103ddc>] show_registers+0x1b0/0x221
[87403.692306]  [<c0103ffc>] die+0x124/0x1ee
[87403.696558]  [<c0104165>] do_trap+0x9f/0xa1
[87403.700988]  [<c0104427>] do_invalid_op+0xa7/0xb1
[87403.706012]  [<c0103871>] error_code+0x39/0x40
[87403.710794]  [<c0180e0a>] mpage_end_io_read+0x5e/0x72
[87403.716154]  [<c0164af9>] bio_endio+0x56/0x7b
[87403.720798]  [<c0256778>] __end_that_request_first+0x1e0/0x301
[87403.726985]  [<c02568a4>] end_that_request_first+0xb/0xd
[87403.732699]  [<c02bd73c>] __ide_end_request+0x54/0xe1
[87403.738214]  [<c02bd807>] ide_end_request+0x3e/0x5c
[87403.743382]  [<c02c35df>] task_error+0x5b/0x97
[87403.748113]  [<c02c36fa>] task_in_intr+0x6e/0xa2
[87403.753120]  [<c02bf19e>] ide_intr+0xaf/0x12c
[87403.757815]  [<c013e5a7>] handle_IRQ_event+0x23/0x57
[87403.763135]  [<c013e66f>] __do_IRQ+0x94/0xfd
[87403.767802]  [<c0105192>] do_IRQ+0x32/0x68
[87403.772278]  [<c010372e>] common_interrupt+0x1a/0x20
[87403.777586]  [<c0100cfe>] cpu_idle+0x7d/0x86
[87403.782184]  [<c01002b7>] rest_init+0x23/0x25
[87403.786869]  [<c04d4889>] start_kernel+0x175/0x19d
[87403.791963]  [<00000000>] 0x0
[87403.795270] Code: ff ff ff b9 0b 00 14 c0 8d 55 dc c7 04 24 02 00 00 00 e8 21 26 25 00 eb dc 55 89 e5 53 89 c3 31 c0 f0 0f b3 03 19 c0 85 c0 75 08 <0f> 0b 1d 02 6c bf 3b c0 89 d8 e8 34 ff ff ff 89 da 31 c9 e8 24
[87403.819040] EIP: [<c01406a7>] unlock_page+0x12/0x2d SS:ESP 0068:c04cfe24
[87403.826101]  <0>Kernel panic - not syncing: Fatal exception in interrupt  

-- 
Best regards,
 Jim Klimov                          mailto:klimov@2ka.mipt.ru



* Re: RAID1 submirror failure causes reboot?
  2006-11-10  8:17 RAID1 submirror failure causes reboot? Jim Klimov
@ 2006-11-10  8:41 ` Neil Brown
  2006-11-10 12:53   ` Re[2]: " Jim Klimov
  0 siblings, 1 reply; 8+ messages in thread
From: Neil Brown @ 2006-11-10  8:41 UTC (permalink / raw)
  To: Jim Klimov; +Cc: Linux RAID, jens.axboe

On Friday November 10, klimov@2ka.mipt.ru wrote:
> Hello Linux RAID,
> 
>   One of our servers using per-partition mirroring has a
>   frequently-failing partition, hdc11 below.
> 
>   When it is deemed failed, the server usually crashes
>   with a stack trace like the one below. This seems strange,
>   because the other submirror, hda11, is alive and well, and
>   the failure should be handled transparently by the RAID
>   layer; isn't that what it is for?
> 
>   After the reboot I usually succeed in hot-adding hdc11
>   back into the mirror, although several times it was not
>   marked dead at all and the array rebuilt by itself after
>   the reboot. That also seems rather incorrect: if the disk
>   died, it should be marked as such (perhaps in the metadata
>   on the live mirror)?
> 
>   Overall, uncool (although mirroring has saved us many
>   times, thanks!)
> 
--snip--
> [87392.564004] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
> [87392.572790] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315718
> [87392.582454] ide: failed opcode was: unknown
> [87392.635961] ide1: reset: success
> [87397.528687] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
> [87397.537607] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315718
> [87397.547335] ide: failed opcode was: unknown
> [87397.551897] end_request: I/O error, dev hdc, sector 176315718
> [87398.520820] raid1: Disk failure on hdc11, disabling device. 
> [87398.520826]  Operation continuing on 1 devices
> [87398.531579] blk: request botched
                 ^^^^^^^^^^^^^^^^^^^^

That looks bad.  Possibly some bug in the IDE controller or elsewhere
in the block layer.  Jens: What might cause that?

--snip--
> [87403.678603] Call Trace:
> [87403.681462]  [<c0103bba>] show_stack_log_lvl+0x8d/0xaa
> [87403.686911]  [<c0103ddc>] show_registers+0x1b0/0x221
> [87403.692306]  [<c0103ffc>] die+0x124/0x1ee
> [87403.696558]  [<c0104165>] do_trap+0x9f/0xa1
> [87403.700988]  [<c0104427>] do_invalid_op+0xa7/0xb1
> [87403.706012]  [<c0103871>] error_code+0x39/0x40
> [87403.710794]  [<c0180e0a>] mpage_end_io_read+0x5e/0x72
> [87403.716154]  [<c0164af9>] bio_endio+0x56/0x7b
> [87403.720798]  [<c0256778>] __end_that_request_first+0x1e0/0x301
> [87403.726985]  [<c02568a4>] end_that_request_first+0xb/0xd
> [87403.732699]  [<c02bd73c>] __ide_end_request+0x54/0xe1
> [87403.738214]  [<c02bd807>] ide_end_request+0x3e/0x5c
> [87403.743382]  [<c02c35df>] task_error+0x5b/0x97
> [87403.748113]  [<c02c36fa>] task_in_intr+0x6e/0xa2
> [87403.753120]  [<c02bf19e>] ide_intr+0xaf/0x12c
> [87403.757815]  [<c013e5a7>] handle_IRQ_event+0x23/0x57
> [87403.763135]  [<c013e66f>] __do_IRQ+0x94/0xfd
> [87403.767802]  [<c0105192>] do_IRQ+0x32/0x68

That doesn't look like raid was involved.  If it was you would expect
to see raid1_end_write_request or raid1_end_read_request in that
trace. 
Do you have any other partitions of hdc in use but not on raid?
Which partition is sector 176315718 in ??

NeilBrown


* Re[2]: RAID1 submirror failure causes reboot?
  2006-11-10  8:41 ` Neil Brown
@ 2006-11-10 12:53   ` Jim Klimov
  2006-11-13  7:17     ` Neil Brown
  0 siblings, 1 reply; 8+ messages in thread
From: Jim Klimov @ 2006-11-10 12:53 UTC (permalink / raw)
  To: Neil Brown; +Cc: Linux RAID

Hello Neil,

>> [87398.531579] blk: request botched
NB>                  ^^^^^^^^^^^^^^^^^^^^

NB> That looks bad.  Possibly some bug in the IDE controller or elsewhere
NB> in the block layer.  Jens: What might cause that?

NB> --snip--

NB> That doesn't look like raid was involved.  If it was you would expect
NB> to see raid1_end_write_request or raid1_end_read_request in that
NB> trace.
So that might be the hardware or software side of the IDE layer
failing the system, or a PCI problem, for example?

Just in case: the motherboard is a Supermicro X5DPE-G2 (E7501 chipset,
dual Xeon-533). Rather old now, though it was way cool back when it was
bought ;) But, like anything, it could have been manufactured badly.
Are there any well-known problems with this hardware by now?

NB> Do you have any other partitions of hdc in use but not on raid?
NB> Which partition is sector 176315718 in ??
All partitions are mirrored, and this sector is in hdc11:

# fdisk /dev/hdc
...
Command (m for help): u
Changing display/entry units to sectors

Command (m for help): p

Disk /dev/hdc: 102.9 GB, 102935347200 bytes
16 heads, 63 sectors/track, 199450 cylinders, total 201045600 sectors
Units = sectors of 1 * 512 = 512 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/hdc1   *          63     1000943      500440+  fd  Linux raid autodetect
/dev/hdc2         1000944     5001695     2000376   fd  Linux raid autodetect
/dev/hdc3         5001696    16721711     5860008   83  Linux
/dev/hdc4        16721712   201045599    92161944    5  Extended
/dev/hdc5        16721775    24534719     3906472+  83  Linux
/dev/hdc6        24534783    28441727     1953472+  83  Linux
/dev/hdc7        28441791    32348735     1953472+  83  Linux
/dev/hdc8        32348799    44068751     5859976+  83  Linux
/dev/hdc9        44068815    51881759     3906472+  83  Linux
/dev/hdc10       51881823    59694767     3906472+  83  Linux
/dev/hdc11       59694831   190555343    65430256+  83  Linux
/dev/hdc12      190555407   201045599     5245096+  83  Linux
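
   A quick sanity check against the table above: the failing sector
   falls inside hdc11, since

     59694831 (start of hdc11) <= 176315718 <= 190555343 (end of hdc11)

   i.e. it is 176315718 - 59694831 = 116620887 sectors into that
   partition.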


-- 
Best regards,
 Jim Klimov                            mailto:klimov@2ka.mipt.ru



* Re: Re[2]: RAID1 submirror failure causes reboot?
  2006-11-10 12:53   ` Re[2]: " Jim Klimov
@ 2006-11-13  7:17     ` Neil Brown
  2006-11-13 20:11       ` Jens Axboe
  0 siblings, 1 reply; 8+ messages in thread
From: Neil Brown @ 2006-11-13  7:17 UTC (permalink / raw)
  To: Jim Klimov; +Cc: Jens Axboe, Linux RAID

On Friday November 10, klimov@2ka.mipt.ru wrote:
> Hello Neil,
> 
> >> [87398.531579] blk: request botched
> NB>                  ^^^^^^^^^^^^^^^^^^^^
> 
> NB> That looks bad.  Possibly some bug in the IDE controller or elsewhere
> NB> in the block layer.  Jens: What might cause that?
> 
> NB> --snip--
> 
> NB> That doesn't look like raid was involved.  If it was you would expect
> NB> to see raid1_end_write_request or raid1_end_read_request in that
> NB> trace.
> So that might be the hardware or software side of the IDE layer
> failing the system, or a PCI problem, for example?

What I think is happening here (and Jens: if you could tell me how
impossible this is, that would be good) is this:

Some error handling somewhere in the low-level ide driver is getting
confused, and somehow one of the sector counts in the 'struct request' is
getting set wrongly.  blk_recalc_rq_sectors notices this and says
"blk: request botched".  It tries to auto-correct by increasing
rq->nr_sectors to be consistent with other counts.
I'm *guessing* this is the wrong thing to do, and that it has the
side-effect that bi_end_io gets called on the bio twice.
The second time, the bio has been freed and reused, so the wrong
bi_end_io is called and it does the wrong thing.

This sounds a bit far-fetched, but it is the only explanation I can
come up with for the observed back trace which is:

[87403.706012]  [<c0103871>] error_code+0x39/0x40
[87403.710794]  [<c0180e0a>] mpage_end_io_read+0x5e/0x72
[87403.716154]  [<c0164af9>] bio_endio+0x56/0x7b
[87403.720798]  [<c0256778>] __end_that_request_first+0x1e0/0x301
[87403.726985]  [<c02568a4>] end_that_request_first+0xb/0xd
[87403.732699]  [<c02bd73c>] __ide_end_request+0x54/0xe1
[87403.738214]  [<c02bd807>] ide_end_request+0x3e/0x5c
[87403.743382]  [<c02c35df>] task_error+0x5b/0x97
[87403.748113]  [<c02c36fa>] task_in_intr+0x6e/0xa2
[87403.753120]  [<c02bf19e>] ide_intr+0xaf/0x12c
[87403.757815]  [<c013e5a7>] handle_IRQ_event+0x23/0x57
[87403.763135]  [<c013e66f>] __do_IRQ+0x94/0xfd
[87403.767802]  [<c0105192>] do_IRQ+0x32/0x68
[87403.772278]  [<c010372e>] common_interrupt+0x1a/0x20

i.e. bio_endio goes straight to mpage_end_io_read despite the fact that the
filesystem is mounted over md/raid1.
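
To make that guess concrete, here is a small userspace toy model; none
of it is kernel code, and all the names and sizes are made up, but it
shows how an inflated sector count could make the completion loop run
an end_io callback belonging to another layer:

/*
 * Userspace toy model (NOT kernel code).  The structures, names and
 * sizes are all invented, purely to illustrate the guess above: if the
 * request's sector count is inflated after a "request botched" fix-up,
 * the completion loop can walk past the request's own bio and end a
 * bio that meanwhile belongs to another layer, so the wrong end_io
 * callback runs from the IDE completion path.
 */
#include <stdio.h>

struct toy_bio {
    const char *owner;                /* who set the callback        */
    int sectors;                      /* size of this bio in sectors */
    void (*end_io)(struct toy_bio *); /* completion callback         */
};

static void raid1_end_io(struct toy_bio *b)
{
    printf("raid1 end_io for '%s'\n", b->owner);
}

static void mpage_end_io(struct toy_bio *b)
{
    printf("mpage end_io for '%s' (wrong layer!)\n", b->owner);
}

/* Complete nr_sectors worth of bios for one request; the loop trusts
 * the caller's sector accounting, as the block layer has to. */
static void end_request(struct toy_bio **bios, int nbios, int nr_sectors)
{
    int i;

    for (i = 0; i < nbios && nr_sectors > 0; i++) {
        bios[i]->end_io(bios[i]);
        nr_sectors -= bios[i]->sectors;
    }
}

int main(void)
{
    /* the request really owns only this one bio, submitted by raid1 */
    struct toy_bio raid1_clone = { "raid1 clone", 8, raid1_end_io };
    /* an unrelated bio that has since been recycled by the fs layer */
    struct toy_bio recycled    = { "fs readpage", 8, mpage_end_io };
    struct toy_bio *bios[] = { &raid1_clone, &recycled };

    end_request(bios, 2, 8);   /* correct count: only the raid1 bio ends    */
    end_request(bios, 2, 16);  /* inflated count: the recycled bio ends too */
    return 0;
}

With the correct count only the raid1 callback runs; with the inflated
count the recycled bio's mpage callback runs from the IDE completion
path, which is roughly the shape of the back trace above.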

Is the kernel compiled with CONFIG_DEBUG_SLAB=y and
CONFIG_DEBUG_PAGEALLOC=y ??
They might help trigger the error earlier and so make the problem more
obvious.

NeilBrown


* Re: Re[2]: RAID1 submirror failure causes reboot?
  2006-11-13  7:17     ` Neil Brown
@ 2006-11-13 20:11       ` Jens Axboe
  2006-11-13 22:05         ` Neil Brown
  0 siblings, 1 reply; 8+ messages in thread
From: Jens Axboe @ 2006-11-13 20:11 UTC (permalink / raw)
  To: Neil Brown; +Cc: Jim Klimov, Linux RAID

On Mon, Nov 13 2006, Neil Brown wrote:
> On Friday November 10, klimov@2ka.mipt.ru wrote:
> > Hello Neil,
> > 
> > >> [87398.531579] blk: request botched
> > NB>                  ^^^^^^^^^^^^^^^^^^^^
> > 
> > NB> That looks bad.  Possibly some bug in the IDE controller or elsewhere
> > NB> in the block layer.  Jens: What might cause that?
> > 
> > NB> --snip--
> > 
> > NB> That doesn't look like raid was involved.  If it was you would expect
> > NB> to see raid1_end_write_request or raid1_end_read_request in that
> > NB> trace.
> > So that might be the hardware or software side of the IDE layer
> > failing the system, or a PCI problem, for example?
> 
> What I think is happening here (and Jens: if you could tell me how
> impossible this is, that would be good) is this:
> 
> Some error handling somewhere in the low-level ide driver is getting
> confused, and somehow one of the sector counts in the 'struct request' is
> getting set wrongly.  blk_recalc_rq_sectors notices this and says
> "blk: request botched".  It tries to auto-correct by increasing
> rq->nr_sectors to be consistent with other counts.
> I'm *guessing* this is the wrong thing to do, and that it has the
> side-effect that bi_end_io gets called on the bio twice.
> The second time, the bio has been freed and reused, so the wrong
> bi_end_io is called and it does the wrong thing.

It doesn't sound at all unreasonable. It's most likely either a bug in
the ide driver, or a "bad" bio being passed to the block layer (and
later on to the request and driver). By "bad" I mean one that isn't
entirely consistent, which could be a bug in eg md. The "request
botched" error is usually a sign of something being severly screwed up.
As you mention further down, get slab and page debugging enabled to
potentially catch this earlier. It could be a sign of a freed bio or
request with corrupt contents.

> This sounds a bit far-fetched, but it is the only explanation I can
> come up with for the observed back trace which is:
> 
> [87403.706012]  [<c0103871>] error_code+0x39/0x40
> [87403.710794]  [<c0180e0a>] mpage_end_io_read+0x5e/0x72
> [87403.716154]  [<c0164af9>] bio_endio+0x56/0x7b
> [87403.720798]  [<c0256778>] __end_that_request_first+0x1e0/0x301
> [87403.726985]  [<c02568a4>] end_that_request_first+0xb/0xd
> [87403.732699]  [<c02bd73c>] __ide_end_request+0x54/0xe1
> [87403.738214]  [<c02bd807>] ide_end_request+0x3e/0x5c
> [87403.743382]  [<c02c35df>] task_error+0x5b/0x97
> [87403.748113]  [<c02c36fa>] task_in_intr+0x6e/0xa2
> [87403.753120]  [<c02bf19e>] ide_intr+0xaf/0x12c
> [87403.757815]  [<c013e5a7>] handle_IRQ_event+0x23/0x57
> [87403.763135]  [<c013e66f>] __do_IRQ+0x94/0xfd
> [87403.767802]  [<c0105192>] do_IRQ+0x32/0x68
> [87403.772278]  [<c010372e>] common_interrupt+0x1a/0x20
> 
> i.e. bio_endio goes straight to mpage_end_io_read despite the fact that the
> filesystem is mounted over md/raid1.

What is the workload? Is io to the real device mixed with io that came
through md as well?

> Is the kernel compiled with CONFIG_DEBUG_SLAB=y and
> CONFIG_DEBUG_PAGEALLOC=y ??
> They might help trigger the error earlier and so make the problem more
> obvious.

Agree, that would be a good plan to enable. Other questions: are you
seeing timeouts at any point? The ide timeout code has some request/bio
"resetting" code which might be worrisome.
> 
> NeilBrown

-- 
Jens Axboe



* Re: Re[2]: RAID1 submirror failure causes reboot?
  2006-11-13 20:11       ` Jens Axboe
@ 2006-11-13 22:05         ` Neil Brown
  2006-11-14  7:28           ` Jens Axboe
  0 siblings, 1 reply; 8+ messages in thread
From: Neil Brown @ 2006-11-13 22:05 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Jim Klimov, Linux RAID

On Monday November 13, jens.axboe@oracle.com wrote:
> 
> It doesn't sound at all unreasonable. It's most likely either a bug in
> the ide driver, or a "bad" bio being passed to the block layer (and
> later on to the request and driver). By "bad" I mean one that isn't
> entirely consistent, which could be a bug in eg md. 

I just noticed (while tracking raid6 problems...) that bio_clone calls
bio_phys_segments and bio_hw_segments (why does it do both?).
This calls blk_recount_segments, which does calculations based on
->bi_bdev.
However, immediately after calling bio_clone, raid1 changes bi_bdev,
thus creating a potential inconsistency in the bio.  Would this sort
of inconsistency cause this problem?
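
To picture the ordering concretely, here is a small userspace sketch;
everything in it is invented (types, helper names, queue limits), and
only the clone-then-retarget order mirrors the raid1 behaviour
described above:

/*
 * Userspace sketch (NOT kernel code; the types, helpers and queue
 * limits are invented) showing only the ordering in question: the
 * clone's segment count is computed while it still points at the md
 * device's queue, and the bio is retargeted at the member disk only
 * afterwards, so the cached count refers to a different queue than
 * the one the bio is finally submitted to.
 */
#include <stdio.h>

struct toy_queue {
    const char *name;
    int max_sectors_per_segment;   /* made-up limit */
};

struct toy_bio {
    struct toy_queue *q;           /* stands in for ->bi_bdev's queue    */
    int sectors;
    int cached_segments;           /* what the clone-time recount cached */
};

static int recount_segments(const struct toy_bio *b)
{
    /* depends on whichever queue the bio points at right now */
    return (b->sectors + b->q->max_sectors_per_segment - 1) /
           b->q->max_sectors_per_segment;
}

static struct toy_bio clone_bio(const struct toy_bio *orig)
{
    struct toy_bio c = *orig;

    c.cached_segments = recount_segments(&c);  /* counted against orig's queue */
    return c;
}

int main(void)
{
    struct toy_queue md_q  = { "md11", 64 };
    struct toy_queue ide_q = { "hdc11", 128 };

    struct toy_bio master = { &md_q, 256, 0 };
    struct toy_bio clone  = clone_bio(&master);  /* counted against md11     */

    clone.q = &ide_q;                            /* then retargeted at hdc11 */

    printf("cached segments: %d, recount against %s: %d\n",
           clone.cached_segments, clone.q->name, recount_segments(&clone));
    return 0;
}

The limits here are arbitrary; the point is only that cached_segments
was computed against a different queue than the one the bio ends up
aimed at, and whether that mismatch can ever matter in practice is the
question.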

> 
> Agree, that would be a good plan to enable. Other questions: are you
> seeing timeouts at any point? The ide timeout code has some request/bio
> "resetting" code which might be worrisome.

Jim could probably answer this with more authority, but there aren't
obvious timeouts from the logs he posted.  A representative sample is:
[87338.675891] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87338.685143] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315711
[87338.694791] ide: failed opcode was: unknown
[87343.557424] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87343.566388] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315711
[87343.576105] ide: failed opcode was: unknown
[87348.472226] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87348.481170] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315711
[87348.490843] ide: failed opcode was: unknown
[87353.387028] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87353.395735] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315711
[87353.405500] ide: failed opcode was: unknown
[87353.461342] ide1: reset: success

Thanks,
NeilBrown


* Re: Re[2]: RAID1 submirror failure causes reboot?
  2006-11-13 22:05         ` Neil Brown
@ 2006-11-14  7:28           ` Jens Axboe
  2006-11-14 10:36             ` Re[4]: " Jim Klimov
  0 siblings, 1 reply; 8+ messages in thread
From: Jens Axboe @ 2006-11-14  7:28 UTC (permalink / raw)
  To: Neil Brown; +Cc: Jim Klimov, Linux RAID

On Tue, Nov 14 2006, Neil Brown wrote:
> On Monday November 13, jens.axboe@oracle.com wrote:
> > 
> > It doesn't sound at all unreasonable. It's most likely either a bug in
> > the ide driver, or a "bad" bio being passed to the block layer (and
> > later on to the request and driver). By "bad" I mean one that isn't
> > entirely consistent, which could be a bug in eg md. 
> 
> I just noticed (while tracking raid6 problems...) that bio_clone calls
> bio_phys_segments and bio_hw_segments (why does it do both?).
> This calls blk_recount_segments, which does calculations based on
> ->bi_bdev.
> However, immediately after calling bio_clone, raid1 changes bi_bdev,
> thus creating a potential inconsistency in the bio.  Would this sort
> of inconsistency cause this problem?

raid1 should change it first, you are right. But it should not matter,
as the real device should have restrictions that are at least equal to
the md device. So it may be a bit more conservative, but I don't think
there's an actual bug there.

> > Agree, that would be a good plan to enable. Other questions: are you
> > seeing timeouts at any point? The ide timeout code has some request/bio
> > "resetting" code which might be worrisome.
> 
> Jim could probably answer this with more authority, but there aren't
> obvious timeouts from the logs he posted.  A representative sample is:
> [87338.675891] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
> [87338.685143] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315711
> [87338.694791] ide: failed opcode was: unknown
> [87343.557424] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
> [87343.566388] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315711
> [87343.576105] ide: failed opcode was: unknown
> [87348.472226] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
> [87348.481170] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315711
> [87348.490843] ide: failed opcode was: unknown
> [87353.387028] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
> [87353.395735] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315711
> [87353.405500] ide: failed opcode was: unknown
> [87353.461342] ide1: reset: success

Then let's wait for Jim to repeat his testing with all the debugging
options enabled; that should make us a little wiser.

-- 
Jens Axboe



* Re[4]: RAID1 submirror failure causes reboot?
  2006-11-14  7:28           ` Jens Axboe
@ 2006-11-14 10:36             ` Jim Klimov
  0 siblings, 0 replies; 8+ messages in thread
From: Jim Klimov @ 2006-11-14 10:36 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Jim Klimov, Linux RAID

Hello Jens,

JA> Then let's wait for Jim to repeat his testing with all the debugging
JA> options enabled; that should make us a little wiser.
OK, I'll boot the kernel rebuilt with these options and report any
findings later. For now, I'll move on to the other questions raised.

I remember that when I ran with most of the debugging options ON last
week, it made the server run darn slow, with a load average around
40-60, little responsiveness, and numerous messages like:
  Hangcheck: hangcheck value past margin!
  BUG: soft lockup detected on CPU#1!
which hid the interesting traces :)
However, I can post the traces captured over several system lifetimes
to the list or privately.

Concerning the other questions:
1) The workload on the software raid is rather small. It's a set of
   system partitions which keep the fileserver's logs, etc. The file
   storage is on 3Ware cards and has substantial load. The MD arrays
   are checked nightly, though (echo check > sync_action), and most
   often this is what triggers the problem. These drives only contain
   mirrored partitions, so there should be no I/O to them that bypasses
   MD, except for the rare cases of lilo running :)

2) I installed a third submirror disk this weekend; it is the IDE
   slave device hdd (on the same channel as the failing hdc). Since
   then I have gotten errors on other partitions, attached below
   as "2*)".
   
3) The failures which lead to reboots are usually preceded by a long
   history of dma_intr errors 0x40 and 0x51; the sample I sent was
   fairly complete, though. A few more errors preceded it, one every
   5 seconds or so, making the full trace look like this:

[87319.049902] hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
[87319.057393] hdc: dma_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315631
[87319.067205] ide: failed opcode was: unknown
[87323.956399] hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
[87323.963681] hdc: dma_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315631
[87323.973171] ide: failed opcode was: unknown
[87328.846265] hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
[87328.853485] hdc: dma_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315631
[87328.862834] ide: failed opcode was: unknown
[87333.736127] hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
[87333.743535] hdc: dma_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315631
[87333.752876] ide: failed opcode was: unknown
[87333.806569] ide1: reset: success
[87338.675891] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87338.685143] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315711
[87338.694791] ide: failed opcode was: unknown
[87343.557424] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87343.566388] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315711
[87343.576105] ide: failed opcode was: unknown
[87348.472226] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87348.481170] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315711
[87348.490843] ide: failed opcode was: unknown
[87353.387028] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87353.395735] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315711
[87353.405500] ide: failed opcode was: unknown
[87353.461342] ide1: reset: success
[87358.326783] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87358.335739] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315718
[87358.345395] ide: failed opcode was: unknown
[87363.208313] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87363.217319] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315718
[87363.228371] ide: failed opcode was: unknown
[87368.106472] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87368.115414] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315718
[87368.125275] ide: failed opcode was: unknown
[87372.979686] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87372.988706] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315718
[87372.998849] ide: failed opcode was: unknown
[87373.052152] ide1: reset: success
[87377.927744] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87377.936682] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315718
[87377.946399] ide: failed opcode was: unknown
[87382.800953] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87382.809881] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315718
[87382.819511] ide: failed opcode was: unknown
[87387.682479] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87387.691473] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315718
[87387.701287] ide: failed opcode was: unknown
[87392.564004] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87392.572790] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315718
[87392.582454] ide: failed opcode was: unknown
[87392.635961] ide1: reset: success
[87397.528687] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87397.537607] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315718
[87397.547335] ide: failed opcode was: unknown
[87397.551897] end_request: I/O error, dev hdc, sector 176315718
[87398.520820] raid1: Disk failure on hdc11, disabling device. 
[87398.520826]  Operation continuing on 1 devices
[87398.531579] blk: request botched
[87398.535098] hdc: task_out_intr: status=0x50 { DriveReady SeekComplete }
[87398.542129] ide: failed opcode was: unknown
[87403.582775] ------------[ cut here ]------------
[87403.587748] kernel BUG at mm/filemap.c:541!
[87403.592082] invalid opcode: 0000 [#1]
[87403.596063] SMP 
[87403.598217] Modules linked in: w83781d hwmon_vid i2c_isa i2c_core w83627hf_wdt
[87403.606114] CPU:    0
[87403.606117] EIP:    0060:[<c01406a7>]    Not tainted VLI
[87403.606120] EFLAGS: 00010046   (2.6.18.2debug #1) 
[87403.619728] EIP is at unlock_page+0x12/0x2d
[87403.624170] eax: 00000000   ebx: c2d5caa8   ecx: e8148680   edx: c2d5caa8
[87403.631543] esi: da71c600   edi: 00000001   ebp: c04cfe28   esp: c04cfe24
[87403.638924] ds: 007b   es: 007b   ss: 0068
[87403.643419] Process swapper (pid: 0, ti=c04ce000 task=c041e500 task.ti=c04ce000)
[87403.650774] Stack: e81487e8 c04cfe3c c0180e0a da71c600 00000000 c0180dac c04cfe64 c0164af9 
[87403.659985]        f7d49000 c04cfe84 f2dea5a0 f2dea5a0 00000000 da71c600 00000000 da71c600 
[87403.669288]        c04cfea8 c0256778 c041e500 00000000 c04cbd90 00000046 00000000 00000000 
[87403.678603] Call Trace:
[87403.681462]  [<c0103bba>] show_stack_log_lvl+0x8d/0xaa
[87403.686911]  [<c0103ddc>] show_registers+0x1b0/0x221
[87403.692306]  [<c0103ffc>] die+0x124/0x1ee
[87403.696558]  [<c0104165>] do_trap+0x9f/0xa1
[87403.700988]  [<c0104427>] do_invalid_op+0xa7/0xb1
[87403.706012]  [<c0103871>] error_code+0x39/0x40
[87403.710794]  [<c0180e0a>] mpage_end_io_read+0x5e/0x72
[87403.716154]  [<c0164af9>] bio_endio+0x56/0x7b
[87403.720798]  [<c0256778>] __end_that_request_first+0x1e0/0x301
[87403.726985]  [<c02568a4>] end_that_request_first+0xb/0xd
[87403.732699]  [<c02bd73c>] __ide_end_request+0x54/0xe1
[87403.738214]  [<c02bd807>] ide_end_request+0x3e/0x5c
[87403.743382]  [<c02c35df>] task_error+0x5b/0x97
[87403.748113]  [<c02c36fa>] task_in_intr+0x6e/0xa2
[87403.753120]  [<c02bf19e>] ide_intr+0xaf/0x12c
[87403.757815]  [<c013e5a7>] handle_IRQ_event+0x23/0x57
[87403.763135]  [<c013e66f>] __do_IRQ+0x94/0xfd
[87403.767802]  [<c0105192>] do_IRQ+0x32/0x68
[87403.772278]  [<c010372e>] common_interrupt+0x1a/0x20
[87403.777586]  [<c0100cfe>] cpu_idle+0x7d/0x86
[87403.782184]  [<c01002b7>] rest_init+0x23/0x25
[87403.786869]  [<c04d4889>] start_kernel+0x175/0x19d
[87403.791963]  [<00000000>] 0x0
[87403.795270] Code: ff ff ff b9 0b 00 14 c0 8d f0 <0>Kernel panic - not syncing: Fatal exception in interrupt
smp_call_function+0x12b/0x130
[87403.935200]  [<c013e66f>] 


2*) The new problem log:

... rebuilding new spare partitions into a 3-way array went well...
[114683.721498]  --- wd:3 rd:3
[114683.721508]  disk 0, wo:0, o:1, dev:hdc10
[114683.721541]  disk 1, wo:0, o:1, dev:hda10
[114683.721556]  disk 2, wo:0, o:1, dev:hdd10
[115150.208060] RAID1 conf printout:
[115150.211893]  --- wd:3 rd:3
[115150.214923]  disk 0, wo:0, o:1, dev:hdc8
[115150.219269]  disk 1, wo:0, o:1, dev:hda8
[115150.224022]  disk 2, wo:0, o:1, dev:hdd8
[115296.149849] RAID1 conf printout:
[115296.149868]  --- wd:3 rd:3
[115296.149895]  disk 0, wo:0, o:1, dev:hdc7
[115296.149915]  disk 1, wo:0, o:1, dev:hda7
[115296.149931]  disk 2, wo:0, o:1, dev:hdd7
[115460.641824] hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
[115460.649294] hdc: dma_intr: error=0x40 { UncorrectableError }, LBAsect=28433205, sector=28433023
[115460.658774] ide: failed opcode was: unknown
[115460.663287] end_request: I/O error, dev hdc, sector 28433023
[115465.938995] hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
[115465.946451] hdc: dma_intr: error=0x40 { UncorrectableError }, LBAsect=28433201, sector=28433031
[115465.957536] ide: failed opcode was: unknown
[115465.962041] end_request: I/O error, dev hdc, sector 28433031
[115488.743391] hdc: dma_timer_expiry: dma status == 0x60
[115488.748714] hdc: DMA timeout retry
[115488.752450] hdc: timeout waiting for DMA
[115488.979377] hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
[115488.987829] ide: failed opcode was: unknown
[115488.992469] hdc: drive not ready for command
[115518.991827] hdc: lost interrupt
[115518.995201] hdc: task_out_intr: status=0x58 { DriveReady SeekComplete DataRequest }
[115519.003493] ide: failed opcode was: unknown
[115524.009004] hdc: status timeout: status=0xd0 { Busy }
[115524.014441] ide: failed opcode was: unknown
[115524.018988] hdc: drive not ready for command
[115524.114548] ide1: reset: success
[115524.198619] raid1: Disk failure on hdc6, disabling device. 
[115524.198624]         Operation continuing on 2 devices
[115524.209260] hdc: task_out_intr: status=0x50 { DriveReady SeekComplete }
[115544.209530] hdc: dma_timer_expiry: dma status == 0x20
[115544.215027] hdc: DMA timeout retry
[115544.218669] hdc: timeout waiting for DMA
[115544.461490] hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
[115544.469875] ide: failed opcode was: unknown
[115544.474388] hdc: drive not ready for command
[115544.503556] RAID1 conf printout:
[115544.507091]  --- wd:1 rd:2
[115544.510120]  disk 0, wo:0, o:1, dev:hdd1
[115544.712573] ------------[ cut here ]------------
[115544.717459] kernel BUG at mm/filemap.c:541!
[115544.721902] invalid opcode: 0000 [#1]
[115544.725806] SMP 
[115544.728015] Modules linked in: w83781d hwmon_vid i2c_isa i2c_core w83627hf_wdt
[115544.735981] CPU:    0
[115544.735983] EIP:    0060:[<c013c436>]    Not tainted VLI
[115544.735986] EFLAGS: 00010046   (2.6.18.2server #4) 
[115544.749248] EIP is at unlock_page+0xf/0x28
[115544.753589] eax: 00000000   ebx: c1f5d8c0   ecx: eab25680   edx: c1f5d8c0
[115544.760627] esi: f6a78f00   edi: 00000001   ebp: 00000000   esp: c0481e7c
[115544.767703] ds: 007b   es: 007b   ss: 0068
[115544.772033] Process swapper (pid: 0, ti=c0480000 task=c040e460 task.ti=c0480000)
[115544.779512] Stack: eab25674 c017bed6 f6a78f00 00000000 c017be7c c016069e 00000003 c2e2aa20 
[115544.788824]        c0115cf1 00000020 f6a78f00 00003400 f6a78f00 00000000 c024f8ac f7f37a00 
[115544.798171]        f7f37a00 00000001 c04fcd94 00000001 00000000 00000000 00000000 00002c00 
[115544.807683] Call Trace:
[115544.810797]  [<c017bed6>] mpage_end_io_read+0x5a/0x6e
[115544.816407]  [<c017be7c>] mpage_end_io_read+0x0/0x6e
[115544.821865]  [<c016069e>] bio_endio+0x5f/0x84
[115544.826749]  [<c0115cf1>] find_busiest_group+0x153/0x4df
[115544.832582]  [<c024f8ac>] __end_that_request_first+0x1ec/0x31c
[115544.838889]  [<c02b508a>] __ide_end_request+0x56/0xe4
[115544.844462]  [<c02b5159>] ide_end_request+0x41/0x65
[115544.849671]  [<c02bad69>] task_end_request+0x37/0x7e
[115544.854950]  [<c02baed9>] task_out_intr+0x84/0xb9
[115544.860128]  [<c01243bd>] del_timer+0x56/0x58
[115544.864839]  [<c02bae55>] task_out_intr+0x0/0xb9
[115544.869760]  [<c02b6a4b>] ide_intr+0xb7/0x132
[115544.874467]  [<c013a2f4>] handle_IRQ_event+0x26/0x59
[115544.879740]  [<c013a3ae>] __do_IRQ+0x87/0xed
[115544.884368]  [<c0105199>] do_IRQ+0x31/0x69
[115544.888843]  [<c0103746>] common_interrupt+0x1a/0x20
[115544.894124]  [<c0100c4f>] default_idle+0x35/0x5b
[115544.899233]  [<c0100d03>] cpu_idle+0x7a/0x83
[115544.903844]  [<c0486810>] start_kernel+0x16a/0x194
[115544.908950]  [<c0486286>] unknown_bootoption+0x0/0x1b6
[115544.914384] Code: e8 5e ff ff ff b9 __wake_up+0x38/0x4e
[115545.083930]  [<c01243bd>] 


-- 
Best regards,
 Jim Klimov                            mailto:klimov@2ka.mipt.ru


