public inbox for linux-bcache@vger.kernel.org
* bcache: btree_split() couldn't split
@ 2014-05-11 16:52 Zhe Yang
  0 siblings, 0 replies; 5+ messages in thread
From: Zhe Yang @ 2014-05-11 16:52 UTC
  To: linux-bcache

Hello,

I'm a bcache user, and I hit this error very often while using it.
Each time it happens, the filesystem on top of the bcache device is
automatically remounted read-only and needs an fsck.

I'm using Arch Linux with stock kernels 3.13.5 through 3.14.1. Is
there any way to avoid hitting this?

BTW, I'd also like to propose a performance-related feature. Bcache
uses LRU by default, so every read MISS causes a write to the cache.
But an SSD can't sustain 20MB/s of writes all the time; under that
kind of stress, some SSDs add up to 50ms of delay to each write. As a
result, the overall performance of bcache degrades to that of the HDD,
just because of the writes for MISSed data. Could you implement a
ratio, so that, for example, only 50% of MISSed data is written to
the SSD?
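
To make the proposal concrete, here is a minimal sketch of such a ratio
as a probabilistic admission check. It is only an illustration of the
idea, not bcache code: the names should_cache_on_miss, admit_ratio and
simulate are hypothetical, and the thread does not say that bcache has
such a knob.

```python
import random


def should_cache_on_miss(admit_ratio, rng=random.random):
    """Decide whether data read on a cache miss is also written to the SSD.

    admit_ratio is the fraction of misses to admit (0.5 means about half).
    Hypothetical helper for illustration only.
    """
    if not 0.0 <= admit_ratio <= 1.0:
        raise ValueError("admit_ratio must be between 0 and 1")
    # rng() returns a float in [0, 1), so a ratio of 1.0 admits every miss
    # and a ratio of 0.0 admits none.
    return rng() < admit_ratio


def simulate(misses, admit_ratio, seed=0):
    """Count how many of `misses` cache misses would be written to the SSD."""
    rng = random.Random(seed)  # seeded so repeated runs give the same count
    return sum(should_cache_on_miss(admit_ratio, rng.random) for _ in range(misses))
```

With admit_ratio=0.5, roughly half of the misses would generate SSD
writes, which halves the miss-driven write load at the cost of a slower
cache warm-up.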

Sincerely,
    Zhe Yang


* Re: bcache: btree_split() couldn't split
@ 2014-05-12 11:53 Mariusz Paradowski
  2014-05-12 16:14 ` Rolf Fokkens
  2014-05-13 17:14 ` Slava Pestov
  0 siblings, 2 replies; 5+ messages in thread
From: Mariusz Paradowski @ 2014-05-12 11:53 UTC
  To: linux-bcache

Confirmed on kernel 3.14.3 from kernel.org:

May 11 17:43:16 x kernel: ------------[ cut here ]------------
May 11 17:43:16 x kernel: WARNING: CPU: 3 PID: 376101 at drivers/md/bcache/btree.c:1979 0xffffffffa00d65ab()
May 11 17:43:16 x kernel: bcache: btree split failed
May 11 17:43:16 x kernel: Modules linked in: e1000e ptp pps_core microcode firmware_class unix mpt2sas raid_class scsi_transport_sas bcache fuse hid_generic usbhid hid xhci_hcd ehci_pci ehci_hcd usbcore usb_common msr cpuid
May 11 17:43:16 x kernel: CPU: 3 PID: 376101 Comm: kworker/3:2 Not tainted 3.14.3 #1
May 11 17:43:16 x kernel: Hardware name:                  /DH87MC, BIOS MCH8710H.86A.0047.2013.0606.1508 06/06/2013
May 11 17:43:16 x kernel: Workqueue: events 0xffffffffa00e8fa0
May 11 17:43:16 x kernel: 0000000000000009 ffffffff81303a63 ffff88040c24b988 ffffffff8104c2fd
May 11 17:43:16 x kernel: ffff8801056f2400 ffff88040c24b9d8 ffff88040c24ba00 ffff88040c24bd10
May 11 17:43:16 x kernel: ffffffffffffffe4 ffffffff8104c367 ffffffffa00ea33b ffff880400000018
May 11 17:43:16 x kernel: Call Trace:
May 11 17:43:16 x kernel: [<ffffffff81303a63>] ? 0xffffffff81303a63
May 11 17:43:16 x kernel: [<ffffffff8104c2fd>] ? 0xffffffff8104c2fd
May 11 17:43:16 x kernel: [<ffffffff8104c367>] ? 0xffffffff8104c367
May 11 17:43:16 x kernel: [<ffffffffa00d65ab>] ? 0xffffffffa00d65ab
May 11 17:43:16 x kernel: [<ffffffff810752c3>] ? 0xffffffff810752c3
May 11 17:43:16 x kernel: [<ffffffffa00d669d>] ? 0xffffffffa00d669d
May 11 17:43:16 x kernel: [<ffffffffa00d7520>] ? 0xffffffffa00d7520
May 11 17:43:16 x kernel: [<ffffffffa00d753b>] ? 0xffffffffa00d753b
May 11 17:43:16 x kernel: [<ffffffffa00d4bce>] ? 0xffffffffa00d4bce
May 11 17:43:16 x kernel: [<ffffffffa00d12a9>] ? 0xffffffffa00d12a9
May 11 17:43:16 x kernel: [<ffffffffa00d4975>] ? 0xffffffffa00d4975
May 11 17:43:16 x kernel: [<ffffffffa00d7520>] ? 0xffffffffa00d7520
May 11 17:43:16 x kernel: [<ffffffffa00d4c65>] ? 0xffffffffa00d4c65
May 11 17:43:16 x kernel: [<ffffffff811bc9c4>] ? 0xffffffff811bc9c4
May 11 17:43:16 x kernel: [<ffffffffa00d7d2c>] ? 0xffffffffa00d7d2c
May 11 17:43:16 x kernel: [<ffffffffa00d7520>] ? 0xffffffffa00d7520
May 11 17:43:16 x kernel: [<ffffffffa00d7e98>] ? 0xffffffffa00d7e98
May 11 17:43:16 x kernel: [<ffffffff81079110>] ? 0xffffffff81079110
May 11 17:43:16 x kernel: [<ffffffffa00e914a>] ? 0xffffffffa00e914a
May 11 17:43:16 x kernel: [<ffffffff81054cb1>] ? 0xffffffff81054cb1
May 11 17:43:16 x kernel: [<ffffffff81054b9d>] ? 0xffffffff81054b9d
May 11 17:43:16 x kernel: [<ffffffff81054eaf>] ? 0xffffffff81054eaf
May 11 17:43:16 x kernel: [<ffffffff8105e9a1>] ? 0xffffffff8105e9a1
May 11 17:43:16 x kernel: [<ffffffff8105c9f3>] ? 0xffffffff8105c9f3
May 11 17:43:16 x kernel: [<ffffffff8105f566>] ? 0xffffffff8105f566
May 11 17:43:16 x kernel: [<ffffffff8105f450>] ? 0xffffffff8105f450
May 11 17:43:16 x kernel: [<ffffffff81064621>] ? 0xffffffff81064621
May 11 17:43:16 x kernel: [<ffffffff81064560>] ? 0xffffffff81064560
May 11 17:43:16 x kernel: [<ffffffff8130853c>] ? 0xffffffff8130853c
May 11 17:43:16 x kernel: [<ffffffff81064560>] ? 0xffffffff81064560
May 11 17:43:16 x kernel: ---[ end trace 4fa5a49292304c0d ]---
May 11 17:43:16 x kernel: bcache: bch_btree_insert() error -12
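
A note on the last line of the trace: the kernel reports failures as
negative errno values, so "error -12" corresponds to errno 12, which is
ENOMEM on Linux. A minimal sketch of that lookup follows; the function
name decode_kernel_error and the LOG_LINE constant are illustrative and
not part of bcache or any existing tool.

```python
import errno
import os
import re

# The final line of the trace above, copied verbatim.
LOG_LINE = "May 11 17:43:16 x kernel: bcache: bch_btree_insert() error -12"


def decode_kernel_error(line):
    """Return (errno name, description) for an 'error -N' kernel log line.

    Returns None when the line carries no negative error code.
    """
    match = re.search(r"error (-\d+)", line)
    if match is None:
        return None
    code = -int(match.group(1))  # kernel codes are negated errno values
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


name, description = decode_kernel_error(LOG_LINE)
print(name, "-", description)
```

An out-of-memory code is consistent with the "btree split failed"
warning directly above it: the split could not obtain what it needed.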
-- 
Mariusz Paradowski


* Re: bcache: btree_split() couldn't split
  2014-05-12 11:53 bcache: btree_split() couldn't split Mariusz Paradowski
@ 2014-05-12 16:14 ` Rolf Fokkens
  2014-05-13 20:41   ` Mariusz Paradowski
  2014-05-13 17:14 ` Slava Pestov
  1 sibling, 1 reply; 5+ messages in thread
From: Rolf Fokkens @ 2014-05-12 16:14 UTC
  To: Mariusz Paradowski, linux-bcache

So far no problems here; I have been using bcache since October. But I'm
currently running kernel 3.14.3, so I might be at risk?

Has this issue been present since a specific kernel version?



* Re: bcache: btree_split() couldn't split
  2014-05-12 11:53 bcache: btree_split() couldn't split Mariusz Paradowski
  2014-05-12 16:14 ` Rolf Fokkens
@ 2014-05-13 17:14 ` Slava Pestov
  1 sibling, 0 replies; 5+ messages in thread
From: Slava Pestov @ 2014-05-13 17:14 UTC
  To: Mariusz Paradowski; +Cc: linux-bcache

Hi Zhe and Mariusz,

Based on my understanding of the code, this problem only occurs with
3.14 and older kernels. I believe Kent fixed this bug in v3.15-rc1
with this patch:

commit 0a63b66db566cffdf90182eb6e66fdd4d0479e63
Author: Kent Overstreet <kmo@daterainc.com>
Date:   Mon Mar 17 17:15:53 2014 -0700

    bcache: Rework btree cache reserve handling

    This changes the bucket allocation reserves to use _real_ reserves - separate
    freelists - instead of watermarks, which if nothing else makes the current code
    saner to reason about and is going to be important in the future when we add
    support for multiple btrees.

    It also adds btree_check_reserve(), which checks (and locks) the reserves for
    both bucket allocation and memory allocation for btree nodes; the old code just
    kinda sorta assumed that since (e.g. for btree node splits) it had the root
    locked and that meant no other threads could try to make use of the same
    reserve; this technically should have been ok for memory allocation (we should
    always have a reserve for memory allocation (the btree node cache is used as a
    reserve and we preallocate it)), but multiple btrees will mean that locking the
    root won't be sufficient anymore, and for the bucket allocation reserve it was
    technically possible for the old code to deadlock.

    Signed-off-by: Kent Overstreet <kmo@daterainc.com>
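
The difference between a watermark and a real reserve, as the commit
message describes it, can be illustrated with a toy model. This is only
a sketch of the concept and not the bcache implementation: the class
names, the can_split() check (loosely modelled on the idea behind
btree_check_reserve()), and the rule that allocations try the general
list before the btree reserve are all assumptions made for the example.

```python
from collections import deque


class WatermarkAllocator:
    """One shared free list; data allocations must leave `watermark` buckets."""

    def __init__(self, buckets, watermark):
        self.free = deque(buckets)
        self.watermark = watermark

    def alloc(self, for_btree):
        # Btree allocations may drain the list; data allocations may not
        # dip below the watermark.
        floor = 0 if for_btree else self.watermark
        if len(self.free) > floor:
            return self.free.popleft()
        return None


class ReserveAllocator:
    """Separate free lists: the btree reserve is never handed to data."""

    def __init__(self, buckets, btree_reserve):
        buckets = list(buckets)
        self.btree_free = deque(buckets[:btree_reserve])
        self.data_free = deque(buckets[btree_reserve:])

    def can_split(self, needed):
        # Check up front that the reserve can cover the whole operation.
        return len(self.btree_free) >= needed

    def alloc(self, for_btree):
        # Modelling choice: try the general list first, then the reserve.
        if self.data_free:
            return self.data_free.popleft()
        if for_btree and self.btree_free:
            return self.btree_free.popleft()
        return None
```

In the watermark model there is nothing to check before starting a
multi-bucket operation, so it can fail partway through; with a separate
list, a caller can confirm via can_split() that the reserve covers the
operation before committing to it.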



* Re: bcache: btree_split() couldn't split
  2014-05-12 16:14 ` Rolf Fokkens
@ 2014-05-13 20:41   ` Mariusz Paradowski
  0 siblings, 0 replies; 5+ messages in thread
From: Mariusz Paradowski @ 2014-05-13 20:41 UTC
  To: linux-bcache

On Mon, 12 May 2014 18:14:30 +0200,
in 5370F366.7030505@rolffokkens.nl,
Rolf Fokkens <rolf@rolffokkens.nl> wrote:

> So far no problems here, I have been using bcache since october. But
> I'm currently running kernel 3.14.3, so I might be at risk?

You might. Independently of this error it may be even riskier if you
are using bcache in writeback mode.


> Is this issue there since a specific kernel version?

I'm not sure. My previous kernel was 3.13.11 and it was even worse:
bcache in writeback mode destroyed my filesystem. I moved from
3.13.11 to 3.14.3 with bcache in writethrough mode, but with this
version the split error occurs. Slava has posted more information on
this.
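
For anyone wanting to check their own machine: Slava's reply in this
thread places the fix in v3.15-rc1, so a rough test is whether the
running release is older than 3.15. A minimal sketch, where the helper
names parse_release and predates_fix are hypothetical and the check
assumes an unpatched upstream kernel (a distribution could have
backported the fix to an older release):

```python
import platform
import re

# Per Slava Pestov's reply in this thread: believed fixed in v3.15-rc1.
FIXED_IN = (3, 15)


def parse_release(release):
    """Extract (major, minor) from a kernel release string such as '3.14.3'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def predates_fix(release):
    """True if the release is older than the version said to carry the fix."""
    return parse_release(release) < FIXED_IN


if __name__ == "__main__":
    running = platform.release()
    print(running, "predates the fix:", predates_fix(running))
```

Both kernels mentioned above (3.13.11 and 3.14.3) fall on the
"predates the fix" side of this check.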
-- 
Mariusz Paradowski
