linux-btrfs.vger.kernel.org archive mirror
* [BUG][v4.2-rc7] Problem replacing a "missing" disk for a raid5-degraded filesystem
@ 2015-08-19 17:11 Goffredo Baroncelli
  2015-08-19 18:41 ` Omar Sandoval
  0 siblings, 1 reply; 4+ messages in thread
From: Goffredo Baroncelli @ 2015-08-19 17:11 UTC (permalink / raw)
  To: linux-btrfs

Hi all,

While playing with raid5 and "btrfs replace" I found a BUG. Basically, if I try to replace a "missing" disk of a "degraded" filesystem, I get a kernel BUG. This is 100% reproducible for me.

To simulate the disk removal, I started qemu and used the monitor command "drive_del drive-virtio-disk1". After that, the "removed" disk (vdj) is still present on the system, but reading it returns an I/O error:


$ sudo cat /dev/vdj 
cat: /dev/vdj: Input/output error

Note: if I remove the disk but do not "umount" and then "mount -o degraded" the filesystem, the replace works fine.


Below are the steps to reproduce the kernel bug.
BR
G.Baroncelli

ghigo@emulato:~$ ./btrfs --version	
btrfs-progs v4.1.2
ghigo@emulato:~$ uname -a
Linux emulato.virtual 4.2.0-rc7 #207 SMP Tue Aug 18 15:34:31 CEST 2015 x86_64 GNU/Linux

$ sudo ./mkfs.btrfs -f -M -d raid5 -m raid5 /dev/vd[efj]
SMALL VOLUME: forcing mixed metadata/data groups
btrfs-progs v4.0
See http://btrfs.wiki.kernel.org for more information.

Label:              (null)
UUID:               af4f8af5-66cb-4e46-b2fe-3d332f44eefe
Node size:          4096
Sector size:        4096
Filesystem size:    150.00GiB
Block group profiles:
  Data+Metadata:    RAID5             2.01GiB
  System:           RAID5            20.00MiB
SSD detected:       no
Incompat features:  mixed-bg, extref, raid56, skinny-metadata
Number of devices:  3
Devices:
   ID        SIZE  PATH
    1    50.00GiB  /dev/vde
    2    50.00GiB  /dev/vdf
    3    50.00GiB  /dev/vdj

$ sudo mount /dev/vde /mnt/btrfs1/
$ sudo cp -rfva /lib/modules/4.2.0-rc7/ /mnt/btrfs1/

During the copy I disconnected the disk /dev/vdj by running
	(qemu) drive_del drive-virtio-disk1
in the qemu monitor.

$ sudo umount /mnt/btrfs1/                         # <-- without these the bug
$ sudo mount -o degraded /dev/vde /mnt/btrfs1/     # <-- doesn't happen: the replace 
                                                   # <-- command works fine

$ sudo ./btrfs replace start -rf 3 /dev/vdd /mnt/btrfs1/
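
For convenience, the steps above can be collected into a single script. This is only a sketch: it assumes it is run as root with btrfs-progs in PATH, it reuses the device names and mount point from this report, and the DRY_RUN variable and the run() and reproduce() helpers are hypothetical names added for illustration. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of the reproduction steps from this report.
# DRY_RUN, run() and reproduce() are illustrative helpers, not part of btrfs-progs.
DRY_RUN="${DRY_RUN:-1}"   # 1 (default): print the commands; 0: execute them
MNT=/mnt/btrfs1

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reproduce() {
    run mkfs.btrfs -f -M -d raid5 -m raid5 /dev/vde /dev/vdf /dev/vdj
    run mount /dev/vde "$MNT"
    # Copy in the background so the disk can be removed while data is written.
    run cp -rfva /lib/modules/4.2.0-rc7/ "$MNT" &
    echo "# now, in the qemu monitor: drive_del drive-virtio-disk1"
    # In a real run, wait for Enter once the drive has been deleted.
    [ "$DRY_RUN" = 1 ] || read -r reply
    wait
    # Remounting degraded is the step that makes the replace fail.
    run umount "$MNT"
    run mount -o degraded /dev/vde "$MNT"
    run btrfs replace start -rf 3 /dev/vdd "$MNT"
}

reproduce
```

Running it with DRY_RUN=0 executes the commands for real and destroys the contents of the listed devices, so it should only be used inside a throwaway virtual machine.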

The replace command then triggered the following oops:

[  206.685533] BTRFS: dev_replace from <missing disk> (devid 3) to /dev/vdd started

[  206.691996] BUG: unable to handle kernel NULL pointer dereference at 0000000000000098
[  206.714714] IP: [<ffffffff8128bbb1>] bio_add_page+0x11/0x90
[  206.716173] PGD bada0067 PUD bada1067 PMD 0 
[  206.717503] Oops: 0000 [#1] SMP 
[  206.718532] Modules linked in: acpi_cpufreq processor thermal_sys cpufreq_userspace cpufreq_powersave cpufreq_conservative cpufreq_stats nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace sunrpc 9p 9pnet fscache dm_mod md_mod ext4 crc16 mbcache jbd2 crc32c_generic btrfs xor raid6_pq sr_mod cdrom sd_mod virtio_blk virtio_net ata_generic ata_piix floppy virtio_pci virtio_ring virtio libata scsi_mod
[  206.727897] CPU: 1 PID: 1938 Comm: btrfs Not tainted 4.2.0-rc7 #207
[  206.728727] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
[  206.729960] task: ffff8800b96e3400 ti: ffff8800b95e0000 task.ti: ffff8800b95e0000
[  206.731012] RIP: 0010:[<ffffffff8128bbb1>]  [<ffffffff8128bbb1>] bio_add_page+0x11/0x90
[  206.732179] RSP: 0018:ffff8800b95e37f8  EFLAGS: 00010292
[  206.732905] RAX: 0000000000000000 RBX: ffff8800a08c4e00 RCX: 0000000000000000
[  206.733842] RDX: 0000000000001000 RSI: ffffea0002eb1f80 RDI: ffff88007f0f4aa8
[  206.734750] RBP: ffff8800b7e21218 R08: ffffffff817298e8 R09: 0000000000000000
[  206.735660] R10: 0000000000000800 R11: 0000000000000000 R12: ffff8800b7e21220
[  206.736615] R13: ffff8800bb155280 R14: ffff88007f1ae8c0 R15: ffff8800b7e21000
[  206.737579] FS:  00007f1dbe4388c0(0000) GS:ffff8800bfb00000(0000) knlGS:0000000000000000
[  206.738713] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  206.739495] CR2: 0000000000000098 CR3: 00000000a060f000 CR4: 00000000000006e0
[  206.740420] Stack:
[  206.740793]  ffffffff00000000 ffffffffa01d8639 0000000000000050 ffff8800b95e3838
[  206.742076]  0000000000000000 ffff8800b96e3400 0000000000000000 ffffffff811951cc
[  206.761559]  0000000000c00000 0000000000000000 ffff8800bab7a800 ffff8800bb155280
[  206.763839] Call Trace:
[  206.764666]  [<ffffffffa01d8639>] ? scrub_add_page_to_rd_bio+0xa9/0x280 [btrfs]
[  206.766592]  [<ffffffff811951cc>] ? alloc_pages_current+0x8c/0xf0
[  206.768076]  [<ffffffffa01dada6>] ? scrub_pages+0x1f6/0x280 [btrfs]
[  206.769611]  [<ffffffffa01dba07>] ? scrub_stripe+0x807/0x1020 [btrfs]
[  206.771164]  [<ffffffff81153bfc>] ? __alloc_pages_nodemask+0x19c/0x9e0
[  206.772674]  [<ffffffffa01dc324>] ? scrub_chunk.isra.19+0x104/0x120 [btrfs]
[  206.773607]  [<ffffffffa01dc574>] ? scrub_enumerate_chunks+0x234/0x470 [btrfs]
[  206.774648]  [<ffffffffa01db190>] ? scrub_setup_ctx.isra.18+0x210/0x280 [btrfs]
[  206.775700]  [<ffffffffa01dde93>] ? btrfs_scrub_dev+0x1b3/0x500 [btrfs]
[  206.776625]  [<ffffffffa01823f1>] ? btrfs_commit_transaction+0x9b1/0xa90 [btrfs]
[  206.777716]  [<ffffffffa0182567>] ? start_transaction+0x97/0x580 [btrfs]
[  206.778590]  [<ffffffffa01f085c>] ? btrfs_dev_replace_start+0x33c/0x3a0 [btrfs]
[  206.779640]  [<ffffffffa01b8ef3>] ? btrfs_ioctl+0x1b33/0x26f0 [btrfs]
[  206.780489]  [<ffffffff81176bdf>] ? do_set_pte+0xcf/0x100
[  206.781246]  [<ffffffff8114babd>] ? filemap_map_pages+0x20d/0x220
[  206.782050]  [<ffffffff810621d8>] ? pte_alloc_one+0x28/0x40
[  206.782798]  [<ffffffff811794b4>] ? handle_mm_fault+0xfc4/0x15f0
[  206.783593]  [<ffffffff811bb4ce>] ? cp_new_stat+0x13e/0x160
[  206.784338]  [<ffffffff811c8c6f>] ? do_vfs_ioctl+0x28f/0x470
[  206.785204]  [<ffffffff811c8ec4>] ? SyS_ioctl+0x74/0x80
[  206.786016]  [<ffffffff815388b2>] ? entry_SYSCALL_64_fastpath+0x16/0x75
[  206.786986] Code: 83 c4 08 c3 48 83 c4 08 e9 9d fd ff ff 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 83 ec 08 48 8b 47 08 4c 8b 57 20 <48> 8b 80 98 00 00 00 4c 8b 88 80 03 00 00 41 8b 81 fc 06 00 00 
[  206.808060] RIP  [<ffffffff8128bbb1>] bio_add_page+0x11/0x90
[  206.808969]  RSP <ffff8800b95e37f8>
[  206.809516] CR2: 0000000000000098
[  206.810046] ---[ end trace 4a9d0280b5ca1f36 ]---
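
As a sanity check on the trace above, the faulting address can be cross-checked against the Code: line. The byte marked with <> starts the faulting instruction, 48 8b 80 98 00 00 00, which on x86-64 decodes as mov 0x98(%rax),%rax. RAX is 0 in the register dump, so the access lands at 0 + 0x98, which matches CR2. The sketch below extracts that displacement; the helper name fault_disp32 is hypothetical, and it assumes the faulting instruction is three opcode/ModRM bytes followed by a little-endian 32-bit displacement, as it is here.

```shell
#!/bin/sh
# Tail of the Code: line copied from the oops above.
code='48 83 ec 08 48 8b 47 08 4c 8b 57 20 <48> 8b 80 98 00 00 00 4c 8b 88 80 03 00 00'

# fault_disp32 CODE: print the disp32 of the instruction marked with <>.
# Assumes the layout "3 opcode/ModRM bytes + little-endian disp32".
fault_disp32() {
    # Keep the marked byte and everything after it, dropping the <> markers.
    bytes=$(printf '%s\n' "$1" | sed -n 's/.*<\([0-9a-f]\{2\}\)>\(.*\)/\1\2/p')
    # Split the byte string into positional parameters $1..$n.
    set -- $bytes
    # Bytes 4..7 are the displacement, least significant byte first.
    printf '0x%x\n' "$(( 0x$7$6$5$4 ))"
}

fault_disp32 "$code"   # prints 0x98
```

This only confirms that the crash is a read through a NULL base pointer at offset 0x98; it does not by itself say which structure member is involved.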



-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


* Re: [BUG][v4.2-rc7] Problem replacing a "missing" disk for a raid5-degraded filesystem
  2015-08-19 17:11 [BUG][v4.2-rc7] Problem replacing a "missing" disk for a raid5-degraded filesystem Goffredo Baroncelli
@ 2015-08-19 18:41 ` Omar Sandoval
  2015-08-19 20:28   ` Omar Sandoval
  0 siblings, 1 reply; 4+ messages in thread
From: Omar Sandoval @ 2015-08-19 18:41 UTC (permalink / raw)
  To: kreijack; +Cc: linux-btrfs

On Wed, Aug 19, 2015 at 07:11:20PM +0200, Goffredo Baroncelli wrote:
> Hi all,
> 
> While playing with raid5 and "btrfs replace" I found a BUG. Basically, if I try to replace a "missing" disk of a "degraded" filesystem, I get a kernel BUG. This is 100% reproducible for me.

Hi, Goffredo, this is a known bug. I have a fix here:
http://www.spinics.net/lists/linux-btrfs/msg44874.html
I'll bug Chris to get this in for 4.3. Let me know if you feel like
testing it out and I can add your Tested-by.

Btw, just a heads up, Gmail is convinced that you're spam, it sounds
like you're failing your domain's DMARC checks.

Thanks,
-- 
Omar


* Re: [BUG][v4.2-rc7] Problem replacing a "missing" disk for a raid5-degraded filesystem
  2015-08-19 18:41 ` Omar Sandoval
@ 2015-08-19 20:28   ` Omar Sandoval
  2015-08-19 20:52     ` Goffredo Baroncelli
  0 siblings, 1 reply; 4+ messages in thread
From: Omar Sandoval @ 2015-08-19 20:28 UTC (permalink / raw)
  To: kreijack; +Cc: linux-btrfs

On Wed, Aug 19, 2015 at 11:41:55AM -0700, Omar Sandoval wrote:
> On Wed, Aug 19, 2015 at 07:11:20PM +0200, Goffredo Baroncelli wrote:
> > Hi all,
> > 
> > While playing with raid5 and "btrfs replace" I found a BUG. Basically, if I try to replace a "missing" disk of a "degraded" filesystem, I get a kernel BUG. This is 100% reproducible for me.
> 
> Hi, Goffredo, this is a known bug. I have a fix here:
> http://www.spinics.net/lists/linux-btrfs/msg44874.html
> I'll bug Chris to get this in for 4.3. Let me know if you feel like
> testing it out and I can add your Tested-by.

Ah, I didn't notice, but it looks like he already pulled it into his
integration-4.3 branch.

> Btw, just a heads up, Gmail is convinced that you're spam, it sounds
> like you're failing your domain's DMARC checks.

Thanks,
-- 
Omar


* Re: [BUG][v4.2-rc7] Problem replacing a "missing" disk for a raid5-degraded filesystem
  2015-08-19 20:28   ` Omar Sandoval
@ 2015-08-19 20:52     ` Goffredo Baroncelli
  0 siblings, 0 replies; 4+ messages in thread
From: Goffredo Baroncelli @ 2015-08-19 20:52 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: linux-btrfs

On 2015-08-19 22:28, Omar Sandoval wrote:
> On Wed, Aug 19, 2015 at 11:41:55AM -0700, Omar Sandoval wrote:
>> On Wed, Aug 19, 2015 at 07:11:20PM +0200, Goffredo Baroncelli wrote:
>>> Hi all,
>>>
>>> While playing with raid5 and "btrfs replace" I found a BUG. Basically, if I try to replace a "missing" disk of a "degraded" filesystem, I get a kernel BUG. This is 100% reproducible for me.
>>
>> Hi, Goffredo, this is a known bug. I have a fix here:
>> http://www.spinics.net/lists/linux-btrfs/msg44874.html
>> I'll bug Chris to get this in for 4.3. Let me know if you feel like
>> testing it out and I can add your Tested-by.
> 
> Ah, I didn't notice, but it looks like he already pulled it into his
> integration-4.3 branch.

It solves the problem; thanks for working on it. If you want, you have my

Tested-by: Goffredo Baroncelli <kreijack@inwind.it>


> 
>> Btw, just a heads up, Gmail is convinced that you're spam, it sounds
>> like you're failing your domain's DMARC checks.

It is not the first time that someone has told me this. I would like to use a Gmail account; unfortunately, Gmail makes a mess of the email (if I send an email to the mailing list and to myself, Gmail collapses the two...).

> 
> Thanks,
> 




-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5
