* Re: [Bug 100491] New: Oops under bitmap_start_sync [md_mod] at boot
[not found] <12832_1435266173_558C6C7D_12832_48_1_bug-100491-1606@https.bugzilla.kernel.org/>
@ 2015-06-28 20:53 ` Sami Liedes
2015-06-29 12:28 ` Austin S Hemmelgarn
0 siblings, 1 reply; 2+ messages in thread
From: Sami Liedes @ 2015-06-28 20:53 UTC (permalink / raw)
To: Goldwyn Rodrigues, Neil Brown, bugzilla-daemon; +Cc: linux-raid, linux-kernel
On Thu, Jun 25, 2015 at 09:02:45PM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=100491
>
> Bug ID: 100491
> Summary: Oops under bitmap_start_sync [md_mod] at boot
[...]
> Reading all physical volumes. This may take a while...
> Found volume group "rootvg" using metadata type lvm2
> device-mapper: raid: Device 0 specified for rebuild: Clearing superblock
> md/raid1:mdX: active with 1 out of 2 mirrors
> mdX: invalid bitmap file superblock: bad magic
> md-cluster module not found.
> mdX: Could not setup cluster service (256)
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000100
> IP: [<ffffffff8159e4a9>] _raw_spin_lock_irq+0x29/0x70
> PGD 0
> Oops: 0002 [#1] PREEMPT SMP
[...]
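For anyone reading along, a minimal sketch of how the two numbers in the oops excerpt can be read. The bit layout is the standard x86 page-fault error code; the helper names (`decode_oops_code`, `looks_like_null_plus_offset`) are made up for illustration, and the "NULL pointer plus member offset" reading is only a heuristic. The excerpt does not say which structure the pointer belonged to.

```python
# Bits of the x86 page-fault error code, printed in hex after "Oops:".
PF_PROT = 0x01   # set: protection violation; clear: page not present
PF_WRITE = 0x02  # set: write access; clear: read access
PF_USER = 0x04   # set: fault in user mode; clear: kernel mode
PF_RSVD = 0x08   # set: reserved bit found set in a page-table entry
PF_INSTR = 0x10  # set: fault on an instruction fetch


def decode_oops_code(code):
    """Translate the error code into a description of the faulting access."""
    return {
        "cause": "protection violation" if code & PF_PROT else "page not present",
        "access": "write" if code & PF_WRITE else "read",
        "mode": "user" if code & PF_USER else "kernel",
        "reserved_bit": bool(code & PF_RSVD),
        "instruction_fetch": bool(code & PF_INSTR),
    }


def looks_like_null_plus_offset(fault_addr, page_size=4096):
    """Heuristic (an assumption, not a proof): an address inside the first
    page is typically a member access through a NULL struct pointer."""
    return 0 <= fault_addr < page_size


oops = decode_oops_code(0x0002)   # value taken from the "Oops: 0002" line
fault_addr = 0x0000000000000100   # value taken from the "BUG:" line
print(oops["cause"], oops["access"], oops["mode"])
print(hex(fault_addr), looks_like_null_plus_offset(fault_addr))
```

Read this way, the excerpt is consistent with a kernel-mode write to an unmapped address 0x100 bytes past a NULL base, which fits a spinlock being taken through a NULL structure pointer.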
I'm marking this as a regression in bugzilla, since this seems to
prevent booting on 4.1.0 at least in certain circumstances (namely
those which I have; I wonder if any raid1 recovery works?) while 4.0.6
boots correctly.
I bisected this down to one of four commits. Well, assuming that the
problem was caused by changes in drivers/md; a fair assumption, I
think. The commits are:
$ git bisect view --oneline
f9209a3 bitmap_create returns bitmap pointer
96ae923 Gather on-going resync information of other nodes
54519c5 Lock bitmap while joining the cluster
b97e9257 Use separate bitmaps for each nodes in the cluster
The crash happens whether or not CONFIG_MD_CLUSTER is enabled.
Here are the versions I tested:
git bisect start '--' 'drivers/md'
# bad: [b953c0d234bc72e8489d3bf51a276c5c4ec85345] Linux 4.1
# good: [39a8804455fb23f09157341d3ba7db6d7ae6ee76] Linux 4.0
# bad: [9ffc8f7cb9647b13dfe4d1ad0d5e1427bb8b46d6] md/raid5: don't do chunk aligned read on degraded array.
# bad: [6dc69c9c460b0cf05b5b3f323a8b944a2e52e76d] md: recover_bitmaps() can be static
# bad: [4b26a08af92c0d9c0bce07612b56ff326112321a] Perform resync for cluster node failure
# good: [cf921cc19cf7c1e99f730a2faa02d80817d684a2] Add node recovery callbacks
# skip: [96ae923ab659e37dd5fc1e05ecbf654e2f94bcbe] Gather on-going resync information of other nodes
# bad: [f9209a323547f054c7439a3bf67c45e64a054bdd] bitmap_create returns bitmap pointer
# skip: [54519c5f4b398bcfe599f652b4ef4004d5fa63ff] Lock bitmap while joining the cluster
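To make the log easier to compare with the four suspects, here is a small sketch that tallies the logged verdict for each of them. The log text and abbreviated hashes are copied from this message; the names `parse_log`, `verdicts_for` and `SUSPECTS` are illustrative only, and "untested" just means the commit has no entry in the log as posted.

```python
import re

BISECT_LOG = """\
# bad: [b953c0d234bc72e8489d3bf51a276c5c4ec85345] Linux 4.1
# good: [39a8804455fb23f09157341d3ba7db6d7ae6ee76] Linux 4.0
# bad: [9ffc8f7cb9647b13dfe4d1ad0d5e1427bb8b46d6] md/raid5: don't do chunk aligned read on degraded array.
# bad: [6dc69c9c460b0cf05b5b3f323a8b944a2e52e76d] md: recover_bitmaps() can be static
# bad: [4b26a08af92c0d9c0bce07612b56ff326112321a] Perform resync for cluster node failure
# good: [cf921cc19cf7c1e99f730a2faa02d80817d684a2] Add node recovery callbacks
# skip: [96ae923ab659e37dd5fc1e05ecbf654e2f94bcbe] Gather on-going resync information of other nodes
# bad: [f9209a323547f054c7439a3bf67c45e64a054bdd] bitmap_create returns bitmap pointer
# skip: [54519c5f4b398bcfe599f652b4ef4004d5fa63ff] Lock bitmap while joining the cluster
"""

# Abbreviated hashes as printed by "git bisect view --oneline" above.
SUSPECTS = ["f9209a3", "96ae923", "54519c5", "b97e9257"]

ENTRY = re.compile(r"^# (good|bad|skip): \[([0-9a-f]+)\] (.*)$")


def parse_log(text):
    """Return (verdict, sha, subject) tuples for each verdict comment."""
    entries = []
    for line in text.splitlines():
        match = ENTRY.match(line)
        if match:
            entries.append(match.groups())
    return entries


def verdicts_for(suspects, entries):
    """Map each abbreviated suspect hash to its logged verdict, if any."""
    result = {}
    for abbrev in suspects:
        result[abbrev] = "untested"
        for verdict, sha, _subject in entries:
            if sha.startswith(abbrev):
                result[abbrev] = verdict
    return result


verdicts = verdicts_for(SUSPECTS, parse_log(BISECT_LOG))
for abbrev, verdict in verdicts.items():
    print(abbrev, verdict)
```

Of the four suspects, only f9209a3 has a direct bad verdict in the log, two were skipped, and b97e9257 does not appear in it at all.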
Sami
* Re: [Bug 100491] New: Oops under bitmap_start_sync [md_mod] at boot
From: Austin S Hemmelgarn @ 2015-06-29 12:28 UTC (permalink / raw)
To: Sami Liedes, Goldwyn Rodrigues, Neil Brown, bugzilla-daemon
Cc: linux-raid, linux-kernel
On 2015-06-28 16:53, Sami Liedes wrote:
> On Thu, Jun 25, 2015 at 09:02:45PM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
>> https://bugzilla.kernel.org/show_bug.cgi?id=100491
>>
>> Bug ID: 100491
>> Summary: Oops under bitmap_start_sync [md_mod] at boot
> [...]
>> Reading all physical volumes. This may take a while...
>> Found volume group "rootvg" using metadata type lvm2
>> device-mapper: raid: Device 0 specified for rebuild: Clearing superblock
>> md/raid1:mdX: active with 1 out of 2 mirrors
>> mdX: invalid bitmap file superblock: bad magic
>> md-cluster module not found.
>> mdX: Could not setup cluster service (256)
>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000100
>> IP: [<ffffffff8159e4a9>] _raw_spin_lock_irq+0x29/0x70
>> PGD 0
>> Oops: 0002 [#1] PREEMPT SMP
> [...]
>
> I'm marking this as a regression in bugzilla, since this seems to
> prevent booting on 4.1.0 at least in certain circumstances (namely
> those which I have; I wonder if any raid1 recovery works?) while 4.0.6
> boots correctly.
I can confirm having the same issue with the MD code being used through
dm-raid.
>
> I bisected this down to one of four commits. Well, assuming that the
> problem was caused by changes in drivers/md; a fair assumption, I
> think. The commits are:
>
> $ git bisect view --oneline
> f9209a3 bitmap_create returns bitmap pointer
> 96ae923 Gather on-going resync information of other nodes
> 54519c5 Lock bitmap while joining the cluster
> b97e9257 Use separate bitmaps for each nodes in the cluster
My own bisect turned up the same set of commits, although I won't
have time to go any further with it until next weekend.
>
> The crash happens whether or not CONFIG_MD_CLUSTER is enabled.
Again, same here.