linux-btrfs.vger.kernel.org archive mirror
* btrfs on bcache
@ 2013-12-18 17:17 eb
  2013-12-19 19:04 ` Fábio Pfeifer
  2013-12-19 19:59 ` Chris Mason
  0 siblings, 2 replies; 20+ messages in thread
From: eb @ 2013-12-18 17:17 UTC (permalink / raw)
  To: linux-btrfs

I recently set up a system (kernel 3.12.5-1-ARCH) that is layered as follows:

/dev/sdb3 - cache0 (80 GB Intel SSD)
/dev/sdc1 - backing device (2 TB WD HDD)

sdb3+sdc1 => /dev/bcache0

On /dev/bcache0, there's a btrfs filesystem with 2 subvolumes, mounted
as / and /home. What's been bothering me are the following entries in
my kernel log:

[13811.845540] incomplete page write in btrfs with offset 1536 and length 2560
[13870.326639] incomplete page write in btrfs with offset 3072 and length 1024

The offset/length values are always either 1536/2560 or 3072/1024;
each pair sums to exactly 4K. There are 607 of these messages as I am
writing this. The machine has been up for 18 hours and has been under
no particular I/O strain (it's a desktop).
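
The observation about the offset/length pairs can be checked mechanically. The following is a minimal sketch, assuming a 4096-byte page and 512-byte sectors (both are assumptions, not values confirmed in this thread) and assuming each log entry sits on a single line. The names `summarize` and `PATTERN` are made up for illustration.

```python
import re

PAGE_SIZE = 4096   # assumed page size (typical on x86_64)
SECTOR = 512       # assumed sector granularity

# Matches lines such as:
# [13811.845540] incomplete page write in btrfs with offset 1536 and length 2560
PATTERN = re.compile(
    r"incomplete page (read|write) in btrfs with offset (\d+) and length (\d+)"
)

def summarize(lines):
    """Return sorted unique (kind, offset, length, fills_page, sector_aligned)."""
    seen = set()
    for line in lines:
        m = PATTERN.search(line)
        if not m:
            continue
        kind, offset, length = m.group(1), int(m.group(2)), int(m.group(3))
        seen.add((
            kind,
            offset,
            length,
            offset + length == PAGE_SIZE,                    # pair covers the page tail
            offset % SECTOR == 0 and length % SECTOR == 0,   # both sector multiples
        ))
    return sorted(seen)

sample = [
    "[13811.845540] incomplete page write in btrfs with offset 1536 and length 2560",
    "[13870.326639] incomplete page write in btrfs with offset 3072 and length 1024",
]
for row in summarize(sample):
    print(row)
```

Feeding it the full `dmesg` output instead of `sample` would show whether any pair ever fails to add up to a page, which would point to a different kind of problem than the one described here.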

Trying to fix this, I detached the cache (still using /dev/bcache0,
but without /dev/sdb3 attached), which made these messages disappear.
As soon as I re-attached /dev/sdb3 they started again, so I am fairly
sure it's an unfavorable interaction between bcache and btrfs.

Is this something I should be worried about (they seem to be emitted
only at KERN_INFO), or is it just an alignment problem? The
underlying HDD uses 4K sectors, while the block_size of bcache seems
to be 512. Could that be the issue here?
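
One way to compare the sector sizes of the layers is to read the block-queue attributes the kernel exposes in sysfs. This is only a sketch: the helper name `queue_block_sizes` is hypothetical, the device names are taken from the setup described above, and it reports what the block layer advertises, not the block size stored in the bcache superblock.

```python
from pathlib import Path

def queue_block_sizes(device, sysfs_root="/sys/block"):
    """Return (logical, physical) block sizes in bytes for a block device."""
    queue = Path(sysfs_root) / device / "queue"
    logical = int((queue / "logical_block_size").read_text().strip())
    physical = int((queue / "physical_block_size").read_text().strip())
    return logical, physical

if __name__ == "__main__":
    # Whole-disk and bcache device names assumed from the layout above.
    for dev in ("sdc", "bcache0"):
        try:
            print(dev, queue_block_sizes(dev))
        except OSError as exc:
            print(dev, "not readable:", exc)
```

A result of (512, 4096) for the HDD would indicate a drive with 4K physical sectors that emulates 512-byte logical sectors.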

I've also encountered incomplete reads and a few csum errors, but I
have not been able to trigger these regularly. I have a feeling that
the error is more likely to be on the bcache end (I've mailed that
list as well), but any insight into the matter would be much
appreciated.

Thanks,

- eb


* Re: btrfs on bcache
  2013-12-18 17:17 eb
@ 2013-12-19 19:04 ` Fábio Pfeifer
  2013-12-19 19:05   ` Fábio Pfeifer
  2013-12-20 22:26   ` Henry de Valence
  2013-12-19 19:59 ` Chris Mason
  1 sibling, 2 replies; 20+ messages in thread
From: Fábio Pfeifer @ 2013-12-19 19:04 UTC (permalink / raw)
  To: linux-btrfs

Any update on this?

I have exactly the same issue here. Kernel 3.12.5-1-ARCH, backing
device 500 GB IDE, cache 24 GB SSD => /dev/bcache0.
On /dev/bcache0 I also have 2 subvolumes, / and /home. I get lots of
messages in dmesg:

(...)
[   22.282469] BTRFS info (device bcache0): csum failed ino 56193 off 212992 csum 519977505 expected csum 3166125439
[   22.282656] incomplete page read in btrfs with offset 1024 and length 3072
[   23.370872] incomplete page read in btrfs with offset 1024 and length 3072
[   23.370890] BTRFS info (device bcache0): csum failed ino 57765 off 106496 csum 3553846164 expected csum 1299185721
[   23.505238] incomplete page read in btrfs with offset 2560 and length 1536
[   23.505256] BTRFS info (device bcache0): csum failed ino 75922 off 172032 csum 1883678196 expected csum 1337496676
[   23.508535] incomplete page read in btrfs with offset 2560 and length 1536
[   23.508547] BTRFS info (device bcache0): csum failed ino 74368 off 237568 csum 2863587994 expected csum 2693116460
[   25.683059] incomplete page read in btrfs with offset 2560 and length 1536
[   25.683078] BTRFS info (device bcache0): csum failed ino 123709 off 57344 csum 1528117893 expected csum 2239543273
[   25.684339] incomplete page read in btrfs with offset 1024 and length 3072
[   26.622384] incomplete page read in btrfs with offset 1024 and length 3072
[   26.906718] incomplete page read in btrfs with offset 2560 and length 1536
[   27.823247] incomplete page read in btrfs with offset 1024 and length 3072
[   27.823265] btrfs_readpage_end_io_hook: 2 callbacks suppressed
[   27.823271] BTRFS info (device bcache0): csum failed ino 34587 off 16384 csum 1180114025 expected csum 474262911
[   28.490066] incomplete page read in btrfs with offset 2560 and length 1536
[   28.490085] BTRFS info (device bcache0): csum failed ino 65817 off 327680 csum 3065880108 expected csum 2663659117
[   29.413824] incomplete page read in btrfs with offset 1024 and length 3072
[   41.913857] incomplete page read in btrfs with offset 2560 and length 1536
[   55.761753] incomplete page read in btrfs with offset 1024 and length 3072
[   55.761771] BTRFS info (device bcache0): csum failed ino 72835 off 81920 csum 1511792656 expected csum 3733709121
[   69.636498] incomplete page read in btrfs with offset 2560 and length 1536
(...)
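
To see which files are affected, the csum failure lines can be reduced to inode numbers and file offsets. The sketch below assumes each log entry is on one line (mail clients tend to wrap them, so wrapped entries would need to be rejoined first) and a 4 KiB block size. The function name `csum_failures` is made up for illustration.

```python
import re

# Matches the tail of lines such as:
# BTRFS info (device bcache0): csum failed ino 56193 off 212992 csum ... expected csum ...
CSUM = re.compile(
    r"csum failed ino (\d+) off (\d+) csum (\d+) expected csum (\d+)"
)

def csum_failures(lines):
    """Yield (inode, file_offset, found_csum, expected_csum) per failure line."""
    for line in lines:
        m = CSUM.search(line)
        if m:
            yield tuple(int(g) for g in m.groups())

sample = [
    "[   22.282469] BTRFS info (device bcache0): csum failed ino 56193 off 212992 csum 519977505 expected csum 3166125439",
    "[   22.282656] incomplete page read in btrfs with offset 1024 and length 3072",
    "[   23.370890] BTRFS info (device bcache0): csum failed ino 57765 off 106496 csum 3553846164 expected csum 1299185721",
]
for ino, off, found, expected in csum_failures(sample):
    # Offsets are byte positions within the file; 4096 is the assumed block size.
    print(f"inode {ino}: offset {off} (block {off // 4096}), "
          f"csum {found:#010x} != {expected:#010x}")
```

The inode numbers can then be mapped back to paths with `btrfs inspect-internal inode-resolve`, assuming the installed btrfs-progs provides that subcommand.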

Should I be worried?

thanks,

Fabio Pfeifer


* Re: btrfs on bcache
  2013-12-19 19:04 ` Fábio Pfeifer
@ 2013-12-19 19:05   ` Fábio Pfeifer
  2013-12-20 22:26   ` Henry de Valence
  1 sibling, 0 replies; 20+ messages in thread
From: Fábio Pfeifer @ 2013-12-19 19:05 UTC (permalink / raw)
  To: linux-btrfs

I forgot to mention: bcache is in writeback mode.


* Re: btrfs on bcache
  2013-12-18 17:17 eb
  2013-12-19 19:04 ` Fábio Pfeifer
@ 2013-12-19 19:59 ` Chris Mason
  2013-12-20 12:36   ` eb
  2013-12-20 12:42   ` Fábio Pfeifer
  1 sibling, 2 replies; 20+ messages in thread
From: Chris Mason @ 2013-12-19 19:59 UTC (permalink / raw)
  To: eab@gmx.ch; +Cc: linux-btrfs@vger.kernel.org

On Wed, 2013-12-18 at 18:17 +0100, eb wrote:
> I've recently setup a system (Kernel 3.12.5-1-ARCH) which is layered as follows:
> 
> /dev/sdb3 - cache0 (80 GB Intel SSD)
> /dev/sdc1 - backing device (2 TB WD HDD)
> 
> sdb3+sdc1 => /dev/bcache0
> 
> On /dev/bcache0, there's a btrfs filesystem with 2 subvolumes, mounted
> as / and /home. What's been bothering me are the following entries in
> my kernel log:
> 
> [13811.845540] incomplete page write in btrfs with offset 1536 and length 2560
> [13870.326639] incomplete page write in btrfs with offset 3072 and length 1024
> 
> The offset/length values are always either 1536/2560 or 3072/1024,
> they sum up nicely to 4K. There are 607 of those in there as I am
> writing this, the machine has been up 18 hours and been under no
> particular I/O strain (it's a desktop).

Btrfs shouldn't be setting the offset on the bios.  Are you able to add
a WARN_ON to the message that prints this so we can see the stack trace?

Could you please cc the bcache and btrfs list together?

-chris



* Re: btrfs on bcache
  2013-12-19 19:59 ` Chris Mason
@ 2013-12-20 12:36   ` eb
  2013-12-20 12:42   ` Fábio Pfeifer
  1 sibling, 0 replies; 20+ messages in thread
From: eb @ 2013-12-20 12:36 UTC (permalink / raw)
  To: Chris Mason; +Cc: linux-bcache, linux-btrfs

On Thu, Dec 19, 2013 at 8:59 PM, Chris Mason <clm@fb.com> wrote:
> On Wed, 2013-12-18 at 18:17 +0100, eb wrote:
> Btrfs shouldn't be setting the offset on the bios.  Are you able to add
> a WARN_ON to the message that prints this so we can see the stack trace?

If you send me a patch (my experience hacking on the kernel is
exactly 0), I'll see whether I can compile a custom kernel and get
it running.

> Could you please cc the bcache and btrfs list together?

Done.

I did some more testing: I copied an image of a 128GB drive over the
network (via netcat) onto the bcache/btrfs system and verified the
result twice using sha1sum. The checksums are identical on the source
system (which is *not* using bcache) and on the bcache/btrfs setup.
I got a lot of the incomplete write messages and a few csum errors in
dmesg, but apparently they haven't done any harm?

I'm not sure how remarkable this is, as large sequential transfers
like this are supposed to bypass the cache anyway, but I assume they
still have to go through the subsystem.
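
The verification step described above (hashing the copied image and comparing it with the source) can be sketched as follows. This is an illustration using the standard `hashlib` module rather than the `sha1sum` tool used in the message. The helper names `sha1_of` and `same_content` are hypothetical.

```python
import hashlib

def sha1_of(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops at EOF, when read() returns b"".
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def same_content(path_a, path_b):
    """Compare two files by digest; in practice one digest comes from the source host."""
    return sha1_of(path_a) == sha1_of(path_b)
```

Note that a file re-read shortly after it was written may be served from the page cache rather than from the bcache device, so a matching digest does not necessarily exercise the read path that logs the errors. Dropping caches first (for example via `/proc/sys/vm/drop_caches`) is one way to force the reads down to the device.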


* Re: btrfs on bcache
  2013-12-19 19:59 ` Chris Mason
  2013-12-20 12:36   ` eb
@ 2013-12-20 12:42   ` Fábio Pfeifer
  2013-12-20 15:46     ` Chris Mason
  1 sibling, 1 reply; 20+ messages in thread
From: Fábio Pfeifer @ 2013-12-20 12:42 UTC (permalink / raw)
  To: Chris Mason; +Cc: eab@gmx.ch, linux-btrfs@vger.kernel.org, linux-bcache

Hello,

I put a "WARN_ON(1);" after each of the printk lines (incomplete page
read and incomplete page write) in extent_io.c.

Here are some call traces:

[   19.509497] incomplete page read in btrfs with offset 2560 and length 1536
[   19.509500] ------------[ cut here ]------------
[   19.509528] WARNING: CPU: 2 PID: 220 at fs/btrfs/extent_io.c:2441
end_bio_extent_readpage+0x788/0xc20 [btrfs]()
[   19.509530] Modules linked in: cdc_acm fuse iTCO_wdt
iTCO_vendor_support snd_hda_codec_analog coretemp kvm_intel kvm raid1
ext4 crc16 md_mod mbcache jbd2 microcode nvidia(PO) psmouse pcspkr
evdev serio_raw i2c_i801 lpc_ich i2c_core snd_hda_intel sky2 skge
i82975x_edac button asus_atk0110 snd_hda_codec snd_hwdep shpchp
snd_pcm snd_page_alloc snd_timer acpi_cpufreq snd edac_core soundcore
processor vboxdrv(O) sr_mod cdrom ata_generic pata_acpi hid_generic
usbhid hid usb_storage sd_mod pata_marvell firewire_ohci uhci_hcd ahci
ehci_pci firewire_core ata_piix libahci crc_itu_t ehci_hcd libata
scsi_mod usbcore usb_common btrfs crc32c libcrc32c xor raid6_pq bcache
[   19.509578] CPU: 2 PID: 220 Comm: btrfs-endio-met Tainted: P
W  O 3.12.5-1-ARCH #1
[   19.509580] Hardware name: System manufacturer System Product
Name/P5WDG2 WS Pro, BIOS 0905    03/06/2008
[   19.509581]  0000000000000009 ffff880231a63cb0 ffffffff814ee37b
0000000000000000
[   19.509585]  ffff880231a63ce8 ffffffff81062bcd ffffea00085eaec0
0000000000000000
[   19.509587]  ffff8802320cc9c0 0000000000000000 ffff880233b0e000
ffff880231a63cf8
[   19.509590] Call Trace:
[   19.509596]  [<ffffffff814ee37b>] dump_stack+0x54/0x8d
[   19.509601]  [<ffffffff81062bcd>] warn_slowpath_common+0x7d/0xa0
[   19.509603]  [<ffffffff81062caa>] warn_slowpath_null+0x1a/0x20
[   19.509614]  [<ffffffffa00b7ba8>] end_bio_extent_readpage+0x788/0xc20 [btrfs]
[   19.509617]  [<ffffffff8107010b>] ? lock_timer_base.isra.35+0x2b/0x50
[   19.509619]  [<ffffffff8106f660>] ? detach_if_pending+0x120/0x120
[   19.509623]  [<ffffffff811d98dd>] bio_endio+0x1d/0x30
[   19.509632]  [<ffffffffa0090227>] end_workqueue_fn+0x37/0x40 [btrfs]
[   19.509642]  [<ffffffffa00c6b1e>] worker_loop+0x14e/0x560 [btrfs]
[   19.509646]  [<ffffffff810952b2>] ? default_wake_function+0x12/0x20
[   19.509656]  [<ffffffffa00c69d0>] ? btrfs_queue_worker+0x330/0x330 [btrfs]
[   19.509672]  [<ffffffff81084fe0>] kthread+0xc0/0xd0
[   19.509677]  [<ffffffff81084f20>] ? kthread_create_on_node+0x120/0x120
[   19.509680]  [<ffffffff814fce7c>] ret_from_fork+0x7c/0xb0
[   19.509683]  [<ffffffff81084f20>] ? kthread_create_on_node+0x120/0x120
[   19.509687] ---[ end trace bbc8d0d088375446 ]---
[   25.592100] incomplete page read in btrfs with offset 2560 and length 1536
[   25.592105] ------------[ cut here ]------------
[   25.592141] WARNING: CPU: 0 PID: 442 at fs/btrfs/extent_io.c:2441
end_bio_extent_readpage+0x788/0xc20 [btrfs]()
[   25.592143] Modules linked in: cdc_acm fuse iTCO_wdt
iTCO_vendor_support snd_hda_codec_analog coretemp kvm_intel kvm raid1
ext4 crc16 md_mod mbcache jbd2 microcode nvidia(PO) psmouse pcspkr
evdev serio_raw i2c_i801 lpc_ich i2c_core snd_hda_intel sky2 skge
i82975x_edac button asus_atk0110 snd_hda_codec snd_hwdep shpchp
snd_pcm snd_page_alloc snd_timer acpi_cpufreq snd edac_core soundcore
processor vboxdrv(O) sr_mod cdrom ata_generic pata_acpi hid_generic
usbhid hid usb_storage sd_mod pata_marvell firewire_ohci uhci_hcd ahci
ehci_pci firewire_core ata_piix libahci crc_itu_t ehci_hcd libata
scsi_mod usbcore usb_common btrfs crc32c libcrc32c xor raid6_pq bcache
[   25.592205] CPU: 0 PID: 442 Comm: btrfs-endio-met Tainted: P
W  O 3.12.5-1-ARCH #1
[   25.592208] Hardware name: System manufacturer System Product
Name/P5WDG2 WS Pro, BIOS 0905    03/06/2008
[   25.592211]  0000000000000009 ffff880229773cb0 ffffffff814ee37b
0000000000000000
[   25.592216]  ffff880229773ce8 ffffffff81062bcd ffffea0002a20a80
0000000000000000
[   25.592220]  ffff88022d3ab180 0000000000000000 ffff88022d326000
ffff880229773cf8
[   25.592225] Call Trace:
[   25.592234]  [<ffffffff814ee37b>] dump_stack+0x54/0x8d
[   25.592240]  [<ffffffff81062bcd>] warn_slowpath_common+0x7d/0xa0
[   25.592245]  [<ffffffff81062caa>] warn_slowpath_null+0x1a/0x20
[   25.592262]  [<ffffffffa00b7ba8>] end_bio_extent_readpage+0x788/0xc20 [btrfs]
[   25.592267]  [<ffffffff810701ef>] ? try_to_del_timer_sync+0x4f/0x70
[   25.592271]  [<ffffffff81070262>] ? del_timer_sync+0x52/0x60
[   25.592275]  [<ffffffff8106f660>] ? detach_if_pending+0x120/0x120
[   25.592280]  [<ffffffff811d98dd>] bio_endio+0x1d/0x30
[   25.592296]  [<ffffffffa0090227>] end_workqueue_fn+0x37/0x40 [btrfs]
[   25.592312]  [<ffffffffa00c6b1e>] worker_loop+0x14e/0x560 [btrfs]
[   25.592318]  [<ffffffff810952b2>] ? default_wake_function+0x12/0x20
[   25.592335]  [<ffffffffa00c69d0>] ? btrfs_queue_worker+0x330/0x330 [btrfs]
[   25.592350]  [<ffffffff81084fe0>] kthread+0xc0/0xd0
[   25.592353]  [<ffffffff81084f20>] ? kthread_create_on_node+0x120/0x120
[   25.592356]  [<ffffffff814fce7c>] ret_from_fork+0x7c/0xb0
[   25.592359]  [<ffffffff81084f20>] ? kthread_create_on_node+0x120/0x120
[   25.592360] ---[ end trace bbc8d0d088375447 ]---
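
When reading traces like the ones above, it helps to separate the frames the unwinder considers reliable from those marked with "?", which are addresses found on the stack that may be stale. The sketch below is a hypothetical helper for that, assuming the frame format shown in this trace; the label "vmlinux" for frames without a module tag is a naming choice made here, not something the kernel prints.

```python
import re

# Matches frames such as:
# [   19.509614]  [<ffffffffa00b7ba8>] end_bio_extent_readpage+0x788/0xc20 [btrfs]
# [   19.509617]  [<ffffffff8107010b>] ? lock_timer_base.isra.35+0x2b/0x50
FRAME = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)

def frames(lines, reliable_only=True):
    """Return (symbol, module) pairs in trace order, innermost frame first."""
    out = []
    for line in lines:
        m = FRAME.search(line)
        if not m:
            continue
        unreliable, symbol, module = m.group(1), m.group(2), m.group(3)
        if unreliable and reliable_only:
            continue  # skip "?" frames, which may be leftovers on the stack
        out.append((symbol, module or "vmlinux"))
    return out

sample = [
    "[   19.509596]  [<ffffffff814ee37b>] dump_stack+0x54/0x8d",
    "[   19.509614]  [<ffffffffa00b7ba8>] end_bio_extent_readpage+0x788/0xc20 [btrfs]",
    "[   19.509617]  [<ffffffff8107010b>] ? lock_timer_base.isra.35+0x2b/0x50",
    "[   19.509623]  [<ffffffff811d98dd>] bio_endio+0x1d/0x30",
    "[   19.509632]  [<ffffffffa0090227>] end_workqueue_fn+0x37/0x40 [btrfs]",
]
for symbol, module in frames(sample):
    print(f"{module:8} {symbol}")
```

For both traces above, the reliable frames show the warning firing in btrfs's read completion handler, called from `bio_endio` on a btrfs worker thread; no bcache function appears on the stack at that point.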

thanks,

Fabio Pfeifer


* Re: btrfs on bcache
  2013-12-20 12:42   ` Fábio Pfeifer
@ 2013-12-20 15:46     ` Chris Mason
  2013-12-24 16:44       ` Fábio Pfeifer
  2014-01-06 23:37       ` Kent Overstreet
  0 siblings, 2 replies; 20+ messages in thread
From: Chris Mason @ 2013-12-20 15:46 UTC (permalink / raw)
  To: fmpfeifer@gmail.com
  Cc: linux-btrfs@vger.kernel.org, eab@gmx.ch,
	linux-bcache@vger.kernel.org

On Fri, 2013-12-20 at 10:42 -0200, Fábio Pfeifer wrote:
> Hello,
> 
> I put the "WARN_ON(1);" after the printk lines (incomplete page read
> and incomplete page write) in extent_io.c.
> 
> here some call traces:
> 
> [   19.509497] incomplete page read in btrfs with offset 2560 and length 1536
> [   19.509500] ------------[ cut here ]------------
> [   19.509528] WARNING: CPU: 2 PID: 220 at fs/btrfs/extent_io.c:2441
> end_bio_extent_readpage+0x788/0xc20 [btrfs]()
> [   19.509530] Modules linked in: cdc_acm fuse iTCO_wdt
> iTCO_vendor_support snd_hda_codec_analog coretemp kvm_intel kvm raid1
> ext4 crc16 md_mod mbcache jbd2 microcode nvidia(PO) psmouse pcspkr
> evdev serio_raw i2c_i801 lpc_ich i2c_core snd_hda_intel sky2 skge
> i82975x_edac button asus_atk0110 snd_hda_codec snd_hwdep shpchp
> snd_pcm snd_page_alloc snd_timer acpi_cpufreq snd edac_core soundcore
> processor vboxdrv(O) sr_mod cdrom ata_generic pata_acpi hid_generic
> usbhid hid usb_storage sd_mod pata_marvell firewire_ohci uhci_hcd ahci
> ehci_pci firewire_core ata_piix libahci crc_itu_t ehci_hcd libata
> scsi_mod usbcore usb_common btrfs crc32c libcrc32c xor raid6_pq bcache
> [   19.509578] CPU: 2 PID: 220 Comm: btrfs-endio-met Tainted: P
> W  O 3.12.5-1-ARCH #1
> [   19.509580] Hardware name: System manufacturer System Product
> Name/P5WDG2 WS Pro, BIOS 0905    03/06/2008
> [   19.509581]  0000000000000009 ffff880231a63cb0 ffffffff814ee37b
> 0000000000000000
> [   19.509585]  ffff880231a63ce8 ffffffff81062bcd ffffea00085eaec0
> 0000000000000000
> [   19.509587]  ffff8802320cc9c0 0000000000000000 ffff880233b0e000
> ffff880231a63cf8
> [   19.509590] Call Trace:
> [   19.509596]  [<ffffffff814ee37b>] dump_stack+0x54/0x8d
> [   19.509601]  [<ffffffff81062bcd>] warn_slowpath_common+0x7d/0xa0
> [   19.509603]  [<ffffffff81062caa>] warn_slowpath_null+0x1a/0x20
> [   19.509614]  [<ffffffffa00b7ba8>] end_bio_extent_readpage+0x788/0xc20 [btrfs]

This should mean that bcache is either failing to read some blocks
properly or is fiddling with the bv_len/bv_offset fields.

Could someone from bcache comment?

-chris



* Re: btrfs on bcache
  2013-12-19 19:04 ` Fábio Pfeifer
  2013-12-19 19:05   ` Fábio Pfeifer
@ 2013-12-20 22:26   ` Henry de Valence
  1 sibling, 0 replies; 20+ messages in thread
From: Henry de Valence @ 2013-12-20 22:26 UTC (permalink / raw)
  To: linux-btrfs

On Thu, Dec 19, 2013 at 2:04 PM, Fábio Pfeifer <fmpfeifer@gmail.com> wrote:
> Any update on this?
>
> I have here exactly the same issue. Kernel 3.12.5-1-ARCH, backing
> device 500 GB IDE, cache 24 GB SSD => /dev/bcache0
> On /dev/bcache I also have 2 subvolumes, / and /home. I get lots of
> messages in dmesg:

I also have this issue.

Also, this afternoon I experienced data corruption on my btrfs device
(checksum errors), which might or might not be related. I don't really
know how to determine the cause, but if anyone has suggestions they'd
be appreciated.

Cheers,
Henry de Valence


* Re: btrfs on bcache
  2013-12-20 15:46     ` Chris Mason
@ 2013-12-24 16:44       ` Fábio Pfeifer
  2014-01-06 23:37       ` Kent Overstreet
  1 sibling, 0 replies; 20+ messages in thread
From: Fábio Pfeifer @ 2013-12-24 16:44 UTC (permalink / raw)
  To: linux-btrfs@vger.kernel.org; +Cc: linux-bcache@vger.kernel.org

(resend in text only)
Some more information about this issue.

I installed my system last November (Arch x86_64), with kernel 3.11.
At that time I didn't see any csum errors or "incomplete page read"
messages. Some time later these errors started to show up. I don't
know exactly whether it was with the 3.11 -> 3.12 upgrade or somewhere
in the 3.12 cycle. I've been using bcache in writeback mode from the
beginning.

I did some more testing:
  - tried bcache in writethrough, writearound and none modes;
  - tried Linux kernel 3.13-rc5.
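
When switching between cache modes for tests like these, it is worth confirming which mode is actually active. bcache exposes this in the `cache_mode` sysfs attribute, which lists all modes and marks the current one with square brackets. The sketch below parses that format; the function names are hypothetical and the device name `bcache0` is taken from this thread.

```python
from pathlib import Path

def parse_cache_mode(text):
    """Return (active_mode, all_modes) from 'writethrough [writeback] writearound none'."""
    modes, active = [], None
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]   # strip the brackets marking the active mode
            active = token
        modes.append(token)
    return active, modes

def current_cache_mode(device="bcache0", sysfs_root="/sys/block"):
    """Read the active cache mode of a bcache device from sysfs."""
    path = Path(sysfs_root) / device / "bcache" / "cache_mode"
    return parse_cache_mode(path.read_text())[0]
```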

The errors didn't go away (maybe because my filesystem was already
corrupted). I didn't have time to test with kernel 3.11 again.

Lately the errors increased, making my system unstable and then
unusable. I had to reformat everything and restore from my backups.

I don't have my / and /home on btrfs over bcache anymore, but I can
run some tests on a spare HD and SSD I have here. I'll report back
after Christmas.

thanks,

Fabio


* Re: btrfs on bcache
  2013-12-20 15:46     ` Chris Mason
  2013-12-24 16:44       ` Fábio Pfeifer
@ 2014-01-06 23:37       ` Kent Overstreet
  2014-01-08 19:35         ` Chris Mason
  1 sibling, 1 reply; 20+ messages in thread
From: Kent Overstreet @ 2014-01-06 23:37 UTC (permalink / raw)
  To: Chris Mason, axboe
  Cc: fmpfeifer@gmail.com, linux-btrfs@vger.kernel.org, eab@gmx.ch,
	linux-bcache@vger.kernel.org

On Fri, Dec 20, 2013 at 03:46:30PM +0000, Chris Mason wrote:
> On Fri, 2013-12-20 at 10:42 -0200, Fábio Pfeifer wrote:
> > Hello,
> > 
> > I put the "WARN_ON(1);" after the printk lines (incomplete page read
> > and incomplete page write) in extent_io.c.
> > 
> > here some call traces:
> > 
> > [   19.509497] incomplete page read in btrfs with offset 2560 and length 1536
> > [   19.509500] ------------[ cut here ]------------
> > [   19.509528] WARNING: CPU: 2 PID: 220 at fs/btrfs/extent_io.c:2441
> > end_bio_extent_readpage+0x788/0xc20 [btrfs]()
> > [   19.509530] Modules linked in: cdc_acm fuse iTCO_wdt
> > iTCO_vendor_support snd_hda_codec_analog coretemp kvm_intel kvm raid1
> > ext4 crc16 md_mod mbcache jbd2 microcode nvidia(PO) psmouse pcspkr
> > evdev serio_raw i2c_i801 lpc_ich i2c_core snd_hda_intel sky2 skge
> > i82975x_edac button asus_atk0110 snd_hda_codec snd_hwdep shpchp
> > snd_pcm snd_page_alloc snd_timer acpi_cpufreq snd edac_core soundcore
> > processor vboxdrv(O) sr_mod cdrom ata_generic pata_acpi hid_generic
> > usbhid hid usb_storage sd_mod pata_marvell firewire_ohci uhci_hcd ahci
> > ehci_pci firewire_core ata_piix libahci crc_itu_t ehci_hcd libata
> > scsi_mod usbcore usb_common btrfs crc32c libcrc32c xor raid6_pq bcache
> > [   19.509578] CPU: 2 PID: 220 Comm: btrfs-endio-met Tainted: P
> > W  O 3.12.5-1-ARCH #1
> > [   19.509580] Hardware name: System manufacturer System Product
> > Name/P5WDG2 WS Pro, BIOS 0905    03/06/2008
> > [   19.509581]  0000000000000009 ffff880231a63cb0 ffffffff814ee37b
> > 0000000000000000
> > [   19.509585]  ffff880231a63ce8 ffffffff81062bcd ffffea00085eaec0
> > 0000000000000000
> > [   19.509587]  ffff8802320cc9c0 0000000000000000 ffff880233b0e000
> > ffff880231a63cf8
> > [   19.509590] Call Trace:
> > [   19.509596]  [<ffffffff814ee37b>] dump_stack+0x54/0x8d
> > [   19.509601]  [<ffffffff81062bcd>] warn_slowpath_common+0x7d/0xa0
> > [   19.509603]  [<ffffffff81062caa>] warn_slowpath_null+0x1a/0x20
> > [   19.509614]  [<ffffffffa00b7ba8>] end_bio_extent_readpage+0x788/0xc20 [btrfs]
> 
> This should mean that bcache is either failing to read some blocks
> properly or is fiddling with the bv_len/bv_offset fields.
> 
> Could someone from bcache comment?

Oh man, I found this and then threw up my hands in despair.

Bcache isn't doing anything with the bv_len/bv_offset fields; it may clone the
biovec so it can retry a bio on error, if the biovecs weren't all whole pages,
otherwise it just passes the biovec down with the next bio to the underlying
cache/backing device.

What btrfs appears to be doing, though: I couldn't believe that code actually
_worked_. Jens, please jump in here, but AFAIK bv_len/bv_offset are in practice
undefined after a bio has completed. They might have been updated if the driver
was using blk_update_request, but many drivers that process the entire bio all
at once won't touch those fields, and that includes anything that clones the
bio (md/dm).
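
To make the point about bv_offset/bv_len concrete, here is a toy userspace model (not kernel code) of what happens when some layer advances a segment in place while processing it. `BioVec`, `advance`, and `looks_incomplete` are hypothetical names, the 4096-byte page size is an assumption, and the check is only a guess at the kind of test a completion handler might make. Whether any layer in this stack actually modifies the fields is exactly what is being debated here.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size

@dataclass
class BioVec:
    """Toy stand-in for the kernel's struct bio_vec (page pointer omitted)."""
    bv_offset: int = 0
    bv_len: int = PAGE_SIZE

def advance(bvec, nbytes):
    """Mimic a layer consuming nbytes from the front of a segment in place."""
    nbytes = min(nbytes, bvec.bv_len)
    bvec.bv_offset += nbytes
    bvec.bv_len -= nbytes

def looks_incomplete(bvec):
    """Hypothetical completion-time check: is the segment still a whole page?"""
    return bvec.bv_offset != 0 or bvec.bv_len != PAGE_SIZE

untouched = BioVec()
advanced = BioVec()
advance(advanced, 3 * 512)   # three 512-byte sectors consumed

print(untouched, looks_incomplete(untouched))
print(advanced, looks_incomplete(advanced))
```

In this model, a segment advanced by three sectors ends up with offset 1536 and length 2560, one of the pairs reported at the start of the thread, and the sum stays at one page. A filesystem that inspects these fields after completion would therefore see a "partial" segment even though all the data was transferred.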

This is probably relevant to immutable biovecs here...

-------------

Ok, I looked again at the relevant btrfs code, I guess I can see how this printk
isn't normally triggered. But Chris, _what on earth_ is btrfs trying to check
for here? And why is it using bv_offset and bv_len further down in
end_bio_extent_readpage()?


* Re: btrfs on bcache
  2014-01-06 23:37       ` Kent Overstreet
@ 2014-01-08 19:35         ` Chris Mason
  2014-01-08 21:13           ` Kent Overstreet
  0 siblings, 1 reply; 20+ messages in thread
From: Chris Mason @ 2014-01-08 19:35 UTC (permalink / raw)
  To: kmo@daterainc.com
  Cc: linux-btrfs@vger.kernel.org, eab@gmx.ch,
	linux-bcache@vger.kernel.org, fmpfeifer@gmail.com,
	axboe@kernel.dk

On Mon, 2014-01-06 at 15:37 -0800, Kent Overstreet wrote:
> On Fri, Dec 20, 2013 at 03:46:30PM +0000, Chris Mason wrote:
> > On Fri, 2013-12-20 at 10:42 -0200, Fábio Pfeifer wrote:
> > > Hello,
> > > 
> > > I put the "WARN_ON(1);" after the printk lines (incomplete page read
> > > and incomplete page write) in extent_io.c.
> > > 
> > > here some call traces:
> > > 
> > > [   19.509497] incomplete page read in btrfs with offset 2560 and length 1536
> > > [   19.509500] ------------[ cut here ]------------
> > > [   19.509528] WARNING: CPU: 2 PID: 220 at fs/btrfs/extent_io.c:2441
> > > end_bio_extent_readpage+0x788/0xc20 [btrfs]()
> > > [   19.509530] Modules linked in: cdc_acm fuse iTCO_wdt
> > > iTCO_vendor_support snd_hda_codec_analog coretemp kvm_intel kvm raid1
> > > ext4 crc16 md_mod mbcache jbd2 microcode nvidia(PO) psmouse pcspkr
> > > evdev serio_raw i2c_i801 lpc_ich i2c_core snd_hda_intel sky2 skge
> > > i82975x_edac button asus_atk0110 snd_hda_codec snd_hwdep shpchp
> > > snd_pcm snd_page_alloc snd_timer acpi_cpufreq snd edac_core soundcore
> > > processor vboxdrv(O) sr_mod cdrom ata_generic pata_acpi hid_generic
> > > usbhid hid usb_storage sd_mod pata_marvell firewire_ohci uhci_hcd ahci
> > > ehci_pci firewire_core ata_piix libahci crc_itu_t ehci_hcd libata
> > > scsi_mod usbcore usb_common btrfs crc32c libcrc32c xor raid6_pq bcache
> > > [   19.509578] CPU: 2 PID: 220 Comm: btrfs-endio-met Tainted: P
> > > W  O 3.12.5-1-ARCH #1
> > > [   19.509580] Hardware name: System manufacturer System Product
> > > Name/P5WDG2 WS Pro, BIOS 0905    03/06/2008
> > > [   19.509581]  0000000000000009 ffff880231a63cb0 ffffffff814ee37b
> > > 0000000000000000
> > > [   19.509585]  ffff880231a63ce8 ffffffff81062bcd ffffea00085eaec0
> > > 0000000000000000
> > > [   19.509587]  ffff8802320cc9c0 0000000000000000 ffff880233b0e000
> > > ffff880231a63cf8
> > > [   19.509590] Call Trace:
> > > [   19.509596]  [<ffffffff814ee37b>] dump_stack+0x54/0x8d
> > > [   19.509601]  [<ffffffff81062bcd>] warn_slowpath_common+0x7d/0xa0
> > > [   19.509603]  [<ffffffff81062caa>] warn_slowpath_null+0x1a/0x20
> > > [   19.509614]  [<ffffffffa00b7ba8>] end_bio_extent_readpage+0x788/0xc20 [btrfs]
> > 
> > This should mean that bcache is either failing to read some blocks
> > properly or is fiddling with the bv_len/bv_offset fields.
> > 
> > Could someone from bcache comment?
> 
> Oh man, I found this and then threw up my hands in despair.
> 
> Bcache isn't doing anything with the bv_len/bv_offset fields; it may clone the
> biovec so it can retry a bio on error, if the biovecs weren't all whole pages,
> otherwise it just passes the biovec down with the next bio to the underlying
> cache/backing device.
> 
> What btrfs appears to be doing though - I couldn't believe that code actually
> _worked_, Jens please jump in here but AFAIK bv_len/bv_offset are in practice
> undefined after a bio's completed, they might have been updated if the driver
> was using blk_update_request but for many drivers that just process the entire
> bio all at once they just won't touch those fields - and that includes anything
> that clones the bio (md/dm).
> 
> This is probably relevant to immutable biovecs here...
> 
> -------------
> 
> Ok, I looked again at the relevant btrfs code, I guess I can see how this printk
> isn't normally triggered. But Chris, _what on earth_ is btrfs trying to check
> for here? And why is it using bv_offset and bv_len further down in
> end_bio_extent_readpage()?

After the IO is done, we're recording the specific logical byte range
that covered the IO.  In practice it's always the full page, so we can
switch to just trusting PAGE_CACHE_SIZE.

-chris


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: btrfs on bcache
  2014-01-08 19:35         ` Chris Mason
@ 2014-01-08 21:13           ` Kent Overstreet
  0 siblings, 0 replies; 20+ messages in thread
From: Kent Overstreet @ 2014-01-08 21:13 UTC (permalink / raw)
  To: Chris Mason
  Cc: linux-btrfs@vger.kernel.org, eab@gmx.ch,
	linux-bcache@vger.kernel.org, fmpfeifer@gmail.com,
	axboe@kernel.dk

On Wed, Jan 08, 2014 at 07:35:32PM +0000, Chris Mason wrote:
> On Mon, 2014-01-06 at 15:37 -0800, Kent Overstreet wrote:
> > Ok, I looked again at the relevant btrfs code, I guess I can see how this printk
> > isn't normally triggered. But Chris, _what on earth_ is btrfs trying to check
> > for here? And why is it using bv_offset and bv_len further down in
> > end_bio_extent_readpage()?
> 
> After the IO is done, we're recording the specific logical byte range
> that covered the IO.  In practice it's always the full page, so we can
> switch to just trusting PAGE_CACHE_SIZE.

Yeah, the code already assumes it was doing PAGE_CACHE_SIZE reads; what
you're effectively checking is that the driver did the bvec all at once,
and that it didn't process half a bvec, update it, then process the rest
- which is a completely fine thing to do.

So for now - yeah, the correct thing to do is to just ignore
bv_offset/bv_len and go by PAGE_CACHE_SIZE. But - after immutable
biovecs is in, _then_ you'll be able to depend on bv_offset/bv_len
remaining unchanged (and you can get rid of your dependency on
PAGE_CACHE_SIZE bvecs).
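
The difference between the two approaches can be sketched in userspace
C. This is only a model under stated assumptions, not the btrfs code:
MODEL_PAGE_SIZE stands in for PAGE_CACHE_SIZE and is assumed to be 4096,
and struct model_range, range_from_fields() and range_from_page() are
hypothetical helpers introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, standing in for PAGE_CACHE_SIZE. */
#define MODEL_PAGE_SIZE 4096u

/* Inclusive byte range [start, end]; hypothetical type. */
struct model_range {
    uint64_t start;
    uint64_t end;
};

/* Range computed from per-segment fields. If a lower layer changed
 * those fields before completion, the result shrinks with them. */
static struct model_range range_from_fields(uint64_t page_start,
                                            unsigned int bv_offset,
                                            unsigned int bv_len)
{
    struct model_range r;

    assert(bv_len > 0);
    r.start = page_start + bv_offset;
    r.end = r.start + bv_len - 1;
    return r;
}

/* Range computed only from the page index and the fixed page size,
 * so it does not depend on what happened to the segment fields. */
static struct model_range range_from_page(uint64_t page_index)
{
    struct model_range r;

    r.start = page_index * MODEL_PAGE_SIZE;
    r.end = r.start + MODEL_PAGE_SIZE - 1;
    return r;
}
```

With untouched fields (offset 0, length 4096) both helpers agree. With
the offset 2560 / length 1536 pair from the trace, range_from_fields()
covers only the tail of the page, while range_from_page() still covers
all of it.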

^ permalink raw reply	[flat|nested] 20+ messages in thread

* btrfs on bcache
@ 2014-04-30 18:16 Felix Homann
  2014-05-01 11:33 ` Austin S Hemmelgarn
  0 siblings, 1 reply; 20+ messages in thread
From: Felix Homann @ 2014-04-30 18:16 UTC (permalink / raw)
  To: linux-btrfs

Hi,
a couple of months ago there was some discussion about issues
when using btrfs on bcache:

http://thread.gmane.org/gmane.comp.file-systems.btrfs/31018

From looking at the mailing list archives I cannot tell whether or not
this issue has been resolved in current kernels from either bcache's
or btrfs' side.

Can anyone tell me what's the current state of this issue? Should it
be safe to use btrfs on bcache by now?

Thanks and kind regards,
Felix

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: btrfs on bcache
  2014-04-30 18:16 Felix Homann
@ 2014-05-01 11:33 ` Austin S Hemmelgarn
  0 siblings, 0 replies; 20+ messages in thread
From: Austin S Hemmelgarn @ 2014-05-01 11:33 UTC (permalink / raw)
  To: Felix Homann, linux-btrfs

On 2014-04-30 14:16, Felix Homann wrote:
> Hi,
> a couple of months ago there was some discussion about issues
> when using btrfs on bcache:
> 
> http://thread.gmane.org/gmane.comp.file-systems.btrfs/31018
> 
> From looking at the mailing list archives I cannot tell whether or not
> this issue has been resolved in current kernels from either bcache's
> or btrfs' side.
> 
> Can anyone tell me what's the current state of this issue? Should it
> be safe to use btrfs on bcache by now?

In all practicality, I don't think anyone who frequents the list knows.
I do know that there are a number of people (myself included) who avoid
bcache in general because of having issues with seemingly random kernel
OOPSes when it is linked in (either as a module or compiled in), even
when it isn't being used.  My advice would be to just test it with some
non-essential data (maybe set up a virtual machine?).

^ permalink raw reply	[flat|nested] 20+ messages in thread

* btrfs on bcache
@ 2014-07-30 22:04 dptrash
  2014-07-30 23:01 ` Larkin Lowrey
  0 siblings, 1 reply; 20+ messages in thread
From: dptrash @ 2014-07-30 22:04 UTC (permalink / raw)
  To: linux-bcache, linux-btrfs

Concerning http://thread.gmane.org/gmane.comp.file-systems.btrfs/31018, does this "bug" still exist?

Kernel 3.14
B: 2x HDD 1 TB
C: 1x SSD 256 GB

# make-bcache -B /dev/sda /dev/sdb -C /dev/sdc --cache_replacement_policy=lru
# mkfs.btrfs -d raid1 -m raid1 -L "BTRFS_RAID" /dev/bcache0 /dev/bcache1

I still have no "incomplete page write" messages in "dmesg | grep btrfs" and the checksums of some manually reviewed files are okay.

Does anyone have more experience with this?

Thanks,

- dp

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: btrfs on bcache
  2014-07-30 22:04 dptrash
@ 2014-07-30 23:01 ` Larkin Lowrey
  2014-08-04 12:57   ` Fábio Pfeifer
  0 siblings, 1 reply; 20+ messages in thread
From: Larkin Lowrey @ 2014-07-30 23:01 UTC (permalink / raw)
  To: dptrash, linux-bcache, linux-btrfs

I've been running two backup servers, with 25T and 20T of data, using
btrfs on bcache (writeback) for about 7 months. I periodically run btrfs
scrubs and backup verifies (SHA1 hashes) and have never had a corruption
issue.

My use of btrfs is simple, though, with no subvolumes and no btrfs level
raid. My bcache backing devices are LVM volumes that span multiple md
raid6 arrays. So, either the bug has been fixed or my configuration is
not susceptible.

I'm running kernel 3.15.5-200.fc20.x86_64.

--Larkin

On 7/30/2014 5:04 PM, dptrash@arcor.de wrote:
> Concerning http://thread.gmane.org/gmane.comp.file-systems.btrfs/31018, does this "bug" still exist?
>
> Kernel 3.14
> B: 2x HDD 1 TB
> C: 1x SSD 256 GB
>
> # make-bcache -B /dev/sda /dev/sdb -C /dev/sdc --cache_replacement_policy=lru
> # mkfs.btrfs -d raid1 -m raid1 -L "BTRFS_RAID" /dev/bcache0 /dev/bcache1
>
> I still have no "incomplete page write" messages in "dmesg | grep btrfs" and the checksums of some manually reviewed files are okay.
>
> Does anyone have more experience with this?
>
> Thanks,
>
> - dp
> --
> To unsubscribe from this list: send the line "unsubscribe linux-bcache" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 20+ messages in thread

* btrfs on bcache
       [not found] <1731942750.1162128.1406757898913.JavaMail.ngmail@webmail06.arcor-online.net>
@ 2014-07-31 15:35 ` dptrash
  2014-08-01  1:55   ` Duncan
  0 siblings, 1 reply; 20+ messages in thread
From: dptrash @ 2014-07-31 15:35 UTC (permalink / raw)
  To: linux-btrfs

Concerning http://thread.gmane.org/gmane.comp.file-systems.btrfs/31018, does this "bug" still exist?

Kernel 3.14
B: 2x HDD 1 TB
C: 1x SSD 256 GB

# make-bcache -B /dev/sda /dev/sdb -C /dev/sdc --cache_replacement_policy=lru
# mkfs.btrfs -d raid1 -m raid1 -L "BTRFS_RAID" /dev/bcache0 /dev/bcache1

I still have no "incomplete page write" messages in "dmesg | grep btrfs" and the checksums of some manually reviewed files are okay.

Does anyone have more experience with this?

Thanks,

- dp

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: btrfs on bcache
  2014-07-31 15:35 ` btrfs on bcache dptrash
@ 2014-08-01  1:55   ` Duncan
  0 siblings, 0 replies; 20+ messages in thread
From: Duncan @ 2014-08-01  1:55 UTC (permalink / raw)
  To: linux-btrfs

dptrash posted on Thu, 31 Jul 2014 17:35:44 +0200 as excerpted:

> Concerning http://thread.gmane.org/gmane.comp.file-systems.btrfs/31018,
> does this "bug" still exist?
> 
> Kernel 3.14 B: 2x HDD 1 TB C: 1x SSD 256 GB
> 
> # make-bcache -B /dev/sda /dev/sdb -C /dev/sdc
> --cache_replacement_policy=lru
> # mkfs.btrfs -d raid1 -m raid1 -L "BTRFS_RAID" /dev/bcache0 /dev/bcache1
> 
> I still have no "incomplete page write" messages in "dmesg | grep btrfs"
> and the checksums of some manually reviewed files are okay.
> 
> Does anyone have more experience with this?

See the reply (not mine) to your earlier post of the question:

http://permalink.gmane.org/gmane.linux.kernel.bcache.devel/2602

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: btrfs on bcache
  2014-07-30 23:01 ` Larkin Lowrey
@ 2014-08-04 12:57   ` Fábio Pfeifer
  0 siblings, 0 replies; 20+ messages in thread
From: Fábio Pfeifer @ 2014-08-04 12:57 UTC (permalink / raw)
  To: Larkin Lowrey
  Cc: dptrash, linux-bcache@vger.kernel.org,
	linux-btrfs@vger.kernel.org

After completely losing my filesystem twice because of this bug, I gave
up using btrfs on top of bcache (also writeback). In my case, I used to
have some subvolumes and some snapshots of these subvolumes, but not many
of them. The btrfs mantra "backup, backup and backup" saved me.

Best regards,

Fábio Pfeifer

2014-07-30 20:01 GMT-03:00 Larkin Lowrey <llowrey@nuclearwinter.com>:
> I've been running two backup servers, with 25T and 20T of data, using
> btrfs on bcache (writeback) for about 7 months. I periodically run btrfs
> scrubs and backup verifies (SHA1 hashes) and have never had a corruption
> issue.
>
> My use of btrfs is simple, though, with no subvolumes and no btrfs level
> raid. My bcache backing devices are LVM volumes that span multiple md
> raid6 arrays. So, either the bug has been fixed or my configuration is
> not susceptible.
>
> I'm running kernel 3.15.5-200.fc20.x86_64.
>
> --Larkin
>
> On 7/30/2014 5:04 PM, dptrash@arcor.de wrote:
>> Concerning http://thread.gmane.org/gmane.comp.file-systems.btrfs/31018, does this "bug" still exist?
>>
>> Kernel 3.14
>> B: 2x HDD 1 TB
>> C: 1x SSD 256 GB
>>
>> # make-bcache -B /dev/sda /dev/sdb -C /dev/sdc --cache_replacement_policy=lru
>> # mkfs.btrfs -d raid1 -m raid1 -L "BTRFS_RAID" /dev/bcache0 /dev/bcache1
>>
>> I still have no "incomplete page write" messages in "dmesg | grep btrfs" and the checksums of some manually reviewed files are okay.
>>
>> Does anyone have more experience with this?
>>
>> Thanks,
>>
>> - dp

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: btrfs on bcache
@ 2014-08-20 20:17 raphead
  0 siblings, 0 replies; 20+ messages in thread
From: raphead @ 2014-08-20 20:17 UTC (permalink / raw)
  To: linux-btrfs

Hi,
has this issue been resolved?
I would like to use the bcache + btrfs combo.
Thanks

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2014-08-20 20:17 UTC | newest]

Thread overview: 20+ messages
     [not found] <1731942750.1162128.1406757898913.JavaMail.ngmail@webmail06.arcor-online.net>
2014-07-31 15:35 ` btrfs on bcache dptrash
2014-08-01  1:55   ` Duncan
2014-08-20 20:17 raphead
  -- strict thread matches above, loose matches on Subject: below --
2014-07-30 22:04 dptrash
2014-07-30 23:01 ` Larkin Lowrey
2014-08-04 12:57   ` Fábio Pfeifer
2014-04-30 18:16 Felix Homann
2014-05-01 11:33 ` Austin S Hemmelgarn
2013-12-18 17:17 eb
2013-12-19 19:04 ` Fábio Pfeifer
2013-12-19 19:05   ` Fábio Pfeifer
2013-12-20 22:26   ` Henry de Valence
2013-12-19 19:59 ` Chris Mason
2013-12-20 12:36   ` eb
2013-12-20 12:42   ` Fábio Pfeifer
2013-12-20 15:46     ` Chris Mason
2013-12-24 16:44       ` Fábio Pfeifer
2014-01-06 23:37       ` Kent Overstreet
2014-01-08 19:35         ` Chris Mason
2014-01-08 21:13           ` Kent Overstreet
