* every boot gives: bcache/alloc.c:78 WARNING
@ 2016-08-03 23:19 Marc MERLIN
2016-08-04 0:26 ` Eric Wheeler
0 siblings, 1 reply; 7+ messages in thread
From: Marc MERLIN @ 2016-08-03 23:19 UTC (permalink / raw)
To: linux-bcache
This happens on all kernels up to 4.7.
Sorry, it happens before ethernet or my storage comes up, so I can't
use netconsole or other text dumps:
https://goo.gl/photos/ubsi6maZXsjkevYY7
Looks like the warning happens while registering one of my bcache devices, but
I can't tell which one or why.
Does the trace give any hints?
(please ignore the module load errors below, different issue)
Thanks,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
* Re: every boot gives: bcache/alloc.c:78 WARNING
2016-08-03 23:19 every boot gives: bcache/alloc.c:78 WARNING Marc MERLIN
@ 2016-08-04 0:26 ` Eric Wheeler
2016-08-04 0:33 ` Marc MERLIN
0 siblings, 1 reply; 7+ messages in thread
From: Eric Wheeler @ 2016-08-04 0:26 UTC (permalink / raw)
To: Marc MERLIN; +Cc: linux-bcache

On Wed, 3 Aug 2016, Marc MERLIN wrote:
> This happens on all kernels up to 4.7.
> Sorry, it happens before ethernet or my storage comes up, so I can't
> use netconsole or other text dumps:
> https://goo.gl/photos/ubsi6maZXsjkevYY7
>
> Looks like the warning happens while registering one of my bcache devices, but
> I can't tell which one or why.
>
> Does the trace give any hints?
> (please ignore the module load errors below, different issue)

Does it cause a problem, or just warn?

    73 uint8_t bch_inc_gen(struct cache *ca, struct bucket *b)
    74 {
    75         uint8_t ret = ++b->gen;
    76
    77         ca->set->need_gc = max(ca->set->need_gc, bucket_gc_gen(b));
    78         WARN_ON_ONCE(ca->set->need_gc > BUCKET_GC_GEN_MAX);
    79
    80         return ret;
    81 }

It looks like something needs garbage collected but perhaps isn't.

You could write to sysfs/.../trigger_gc:

https://evilpiepirate.org/git/linux-bcache.git/tree/Documentation/bcache.txt

trigger_gc
  Writing to this file forces garbage collection to run.

If that doesn't work, I wonder what increasing BUCKET_GC_GEN_MAX would do,
though I don't know if that is safe. It's set to 96U, so it's not on a
bit boundary, which sounds like it could be slightly safer, but I wouldn't
try it unless this is a test machine.

--
Eric Wheeler
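The arithmetic behind the warning at alloc.c:78 can be modelled in user space. What follows is only a rough sketch, not the kernel code: the `model_*` names and structs are hypothetical, and it assumes that `bucket_gc_gen()` is the 8-bit distance between a bucket's current generation and the generation recorded at the last GC pass. The 96U limit is the value quoted in the message above.

```c
#include <assert.h>
#include <stdint.h>

/* Limit quoted in the thread (96U). */
#define BUCKET_GC_GEN_MAX 96U

/* Hypothetical, stripped-down stand-ins for the kernel structures. */
struct model_bucket {
	uint8_t gen;     /* current generation; wraps at 256 */
	uint8_t last_gc; /* generation recorded by the last GC pass */
};

struct model_set {
	uint8_t need_gc; /* largest generation distance seen so far */
};

/* Assumed: distance since the last GC pass, computed modulo 256. */
static uint8_t model_bucket_gc_gen(const struct model_bucket *b)
{
	return (uint8_t)(b->gen - b->last_gc);
}

/*
 * Same shape as the quoted bch_inc_gen(), except that it returns 1 when
 * the WARN_ON_ONCE() condition would be true instead of the new gen.
 */
static int model_inc_gen(struct model_set *set, struct model_bucket *b)
{
	uint8_t dist;

	assert(set && b);
	b->gen++;
	dist = model_bucket_gc_gen(b);
	if (dist > set->need_gc)
		set->need_gc = dist;
	return set->need_gc > BUCKET_GC_GEN_MAX;
}

/*
 * Bump one bucket up to max_bumps times with no GC in between and return
 * the bump on which the condition first fires, or 0 if it never does.
 */
static unsigned model_bumps_until_warning(unsigned max_bumps)
{
	struct model_set set = { 0 };
	struct model_bucket b = { 0, 0 };
	unsigned i;

	for (i = 1; i <= max_bumps; i++)
		if (model_inc_gen(&set, &b))
			return i;
	return 0;
}
```

Under these assumptions `model_bumps_until_warning(200)` returns 97: the condition first fires on the bump that pushes a bucket more than 96 generations past its last GC, which is consistent with the reading that garbage collection has not caught up.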
* Re: every boot gives: bcache/alloc.c:78 WARNING
2016-08-04 0:26 ` Eric Wheeler
@ 2016-08-04 0:33 ` Marc MERLIN
2016-08-04 1:43 ` Eric Wheeler
0 siblings, 1 reply; 7+ messages in thread
From: Marc MERLIN @ 2016-08-04 0:33 UTC (permalink / raw)
To: Eric Wheeler; +Cc: linux-bcache

On Wed, Aug 03, 2016 at 05:26:30PM -0700, Eric Wheeler wrote:
> On Wed, 3 Aug 2016, Marc MERLIN wrote:
>
> > This happens on all kernels up to 4.7.
> > Sorry, it happens before ethernet or my storage comes up, so I can't
> > use netconsole or other text dumps:
> > https://goo.gl/photos/ubsi6maZXsjkevYY7
> >
> > Looks like the warning happens while registering one of my bcache devices, but
> > I can't tell which one or why.
> >
> > Does the trace give any hints?
> > (please ignore the module load errors below, different issue)
>
> Does it cause a problem, or just warn?

No problem, but since it's a warning, I'm reporting it.

> It looks like something needs garbage collected but perhaps isn't.
>
> You could write to sysfs/.../trigger_gc:
>
> https://evilpiepirate.org/git/linux-bcache.git/tree/Documentation/bcache.txt
> trigger_gc
>   Writing to this file forces garbage collection to run.

BTW both this and the just released 4.7.0 are still missing the
documentation updates I contributed months ago. Any idea what's going on
there?

> If that doesn't work, I wonder what increasing BUCKET_GC_GEN_MAX would do,
> though I don't know if that is safe. It's set to 96U, so it's not on a
> bit boundary, which sounds like it could be slightly safer, but I wouldn't
> try it unless this is a test machine.

That's on the backing device, correct?
I have 3 of them, and I can only write to them way way later in the boot
process.
Should I do that one by one and see if I get output now?

Thanks,
Marc
* Re: every boot gives: bcache/alloc.c:78 WARNING
2016-08-04 0:33 ` Marc MERLIN
@ 2016-08-04 1:43 ` Eric Wheeler
2016-08-04 2:44 ` Eric Wheeler
0 siblings, 1 reply; 7+ messages in thread
From: Eric Wheeler @ 2016-08-04 1:43 UTC (permalink / raw)
To: Marc MERLIN; +Cc: linux-bcache

On Wed, 3 Aug 2016, Marc MERLIN wrote:
> On Wed, Aug 03, 2016 at 05:26:30PM -0700, Eric Wheeler wrote:
> > You could write to sysfs/.../trigger_gc:
> >
> > https://evilpiepirate.org/git/linux-bcache.git/tree/Documentation/bcache.txt
> > trigger_gc
> >   Writing to this file forces garbage collection to run.
>
> > If that doesn't work, I wonder what increasing BUCKET_GC_GEN_MAX would do,
> > though I don't know if that is safe. It's set to 96U, so it's not on a
> > bit boundary, which sounds like it could be slightly safer, but I wouldn't
> > try it unless this is a test machine.
>
> That's on the backing device, correct?
> I have 3 of them, and I can only write to them way way later in the boot
> process.
> Should I do that one by one and see if I get output now?

When you say do "that", do you mean `trigger_gc`?

I think trigger_gc is a cache thing, but the whole bcacheN dev might need to
be online before it can be triggered (not sure). Backing devices really
have metadata, just superblock.

--
Eric Wheeler
* Re: every boot gives: bcache/alloc.c:78 WARNING
2016-08-04 1:43 ` Eric Wheeler
@ 2016-08-04 2:44 ` Eric Wheeler
2016-08-04 3:23 ` Marc MERLIN
0 siblings, 1 reply; 7+ messages in thread
From: Eric Wheeler @ 2016-08-04 2:44 UTC (permalink / raw)
To: Marc MERLIN; +Cc: linux-bcache

On Wed, 3 Aug 2016, Eric Wheeler wrote:
> When you say do "that", do you mean `trigger_gc`?
>
> I think trigger_gc is a cache thing, but the whole bcacheN dev might need to
> be online before it can be triggered (not sure). Backing devices really
> have metadata, just superblock.

I meant to say:

Backing devices have no metadata, just superblock.

--
Eric Wheeler
* Re: every boot gives: bcache/alloc.c:78 WARNING
2016-08-04 2:44 ` Eric Wheeler
@ 2016-08-04 3:23 ` Marc MERLIN
2016-08-04 4:32 ` Eric Wheeler
0 siblings, 1 reply; 7+ messages in thread
From: Marc MERLIN @ 2016-08-04 3:23 UTC (permalink / raw)
To: Eric Wheeler; +Cc: linux-bcache

On Wed, Aug 03, 2016 at 07:44:41PM -0700, Eric Wheeler wrote:
> > When you say do "that", do you mean `trigger_gc`?
> >
> > I think trigger_gc is a cache thing, but the whole bcacheN dev might need to
> > be online before it can be triggered (not sure). Backing devices really
> > have metadata, just superblock.
>
> I meant to say:
>
> Backing devices have no metadata, just superblock.

Mmmh, doing this just killed my cache:
saruman:/sys/block/bcache0# echo 1 > /sys/fs/bcache/7f2e1508-8db6-48cb-85d6-606c88f81f63/internal/trigger_gc

[ 1639.204612] bcache: error on fc8cd783-346b-48f5-a619-fb0380584aa9: key too stale: 97, need_gc 97, disabling caching
[ 1639.204625] CPU: 7 PID: 519 Comm: bcache_gc Tainted: G W OE 4.4.5-amd64-volpreempt-sysrq-20160312bc5 #10
[ 1639.204627] Hardware name: LENOVO 20ERCTO1WW/20ERCTO1WW, BIOS N1DET41W (1.15 ) 12/31/2015
[ 1639.204629] 0000000000000000 ffff8808781fbbc0 ffffffff8134d88e ffff880875040ab8
[ 1639.204635] ffff88087a3edcd0 ffff8808781fbc00 ffffffffc03b8609 0000000000000001
[ 1639.204639] ffff880875040ab8 ffffffffc03b17ed ffff88087a3edcd0 ffff8808781fbc50
[ 1639.204644] Call Trace:
[ 1639.204649] [<ffffffff8134d88e>] dump_stack+0x61/0x7d
[ 1639.204668] [<ffffffffc03b8609>] bch_extent_bad+0xd7/0x12b [bcache]
[ 1639.204677] [<ffffffffc03b17ed>] ? bch_ptr_invalid+0xc/0xc [bcache]
[ 1639.204684] [<ffffffffc03b17f7>] bch_ptr_bad+0xa/0xc [bcache]
[ 1639.204690] [<ffffffffc03b1646>] bch_btree_iter_next_filter+0x32/0x42 [bcache]
[ 1639.204695] [<ffffffffc03b1ce2>] btree_gc_count_keys+0x3b/0x59 [bcache]
[ 1639.204701] [<ffffffffc03b5e44>] btree_gc_recurse+0x11b/0x2db [bcache]
[ 1639.204705] [<ffffffff8164ad9b>] ? __schedule+0x3b1/0x575
[ 1639.204710] [<ffffffffc03b27e5>] ? __bch_btree_mark_key+0xba/0x1a4 [bcache]
[ 1639.204716] [<ffffffffc03b63c9>] bch_btree_gc+0x246/0x3cc [bcache]
[ 1639.204722] [<ffffffffc03b63c9>] ? bch_btree_gc+0x246/0x3cc [bcache]
[ 1639.204725] [<ffffffff8108d0f8>] ? wake_up_atomic_t+0x2c/0x2c
[ 1639.204731] [<ffffffffc03b6586>] bch_gc_thread+0x37/0xea [bcache]
[ 1639.204736] [<ffffffffc03b654f>] ? bch_btree_gc+0x3cc/0x3cc [bcache]
[ 1639.204741] [<ffffffffc03b654f>] ? bch_btree_gc+0x3cc/0x3cc [bcache]
[ 1639.204745] [<ffffffff81075c36>] kthread+0xa5/0xad
[ 1639.204747] [<ffffffff81075b91>] ? kthread_parkme+0x24/0x24
[ 1639.204750] [<ffffffff8164decf>] ret_from_fork+0x3f/0x70
[ 1639.204752] [<ffffffff81075b91>] ? kthread_parkme+0x24/0x24
[ 1639.246944] bcache: cache_set_free() Cache set fc8cd783-346b-48f5-a619-fb0380584aa9 unregistered

That's not good.

What should I do now?

Marc
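The error line above reports "key too stale: 97, need_gc 97", which is one past the 96U limit quoted earlier in the thread. A staleness check of that kind can be sketched as below. This is an illustrative user-space model only: the `model_*` names are hypothetical, and both the modulo-256 distance and the rule that distances above 128 count as "not stale" are assumptions about how such a check is typically written, not something the thread confirms.

```c
#include <assert.h>
#include <stdint.h>

/* Threshold for this sketch; same value as the 96U quoted in the thread. */
#define MODEL_STALE_MAX 96U

/*
 * Assumed staleness measure: how far the bucket's generation has moved
 * past the generation stored in the key's pointer, modulo 256. Distances
 * above 128 are treated as "pointer is newer than bucket", i.e. 0.
 */
static uint8_t model_gen_after(uint8_t bucket_gen, uint8_t ptr_gen)
{
	uint8_t r = (uint8_t)(bucket_gen - ptr_gen);

	return r > 128U ? 0 : r;
}

/* Returns 1 where this model would report "key too stale". */
static int model_key_too_stale(uint8_t bucket_gen, uint8_t ptr_gen)
{
	return model_gen_after(bucket_gen, ptr_gen) > MODEL_STALE_MAX;
}

/* Small worked example using the number from the error message. */
static void model_example(void)
{
	assert(model_key_too_stale(97, 0));  /* 97 generations behind */
	assert(!model_key_too_stale(96, 0)); /* exactly at the limit */
}
```

In this model a key whose pointer is 96 generations behind its bucket still passes and one that is 97 behind does not, which matches the pair of 97s in the log line.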
* Re: every boot gives: bcache/alloc.c:78 WARNING
2016-08-04 3:23 ` Marc MERLIN
@ 2016-08-04 4:32 ` Eric Wheeler
0 siblings, 0 replies; 7+ messages in thread
From: Eric Wheeler @ 2016-08-04 4:32 UTC (permalink / raw)
To: Marc MERLIN; +Cc: linux-bcache

On Wed, 3 Aug 2016, Marc MERLIN wrote:
> Mmmh, doing this just killed my cache:
> saruman:/sys/block/bcache0# echo 1 > /sys/fs/bcache/7f2e1508-8db6-48cb-85d6-606c88f81f63/internal/trigger_gc
>
> [ 1639.204612] bcache: error on fc8cd783-346b-48f5-a619-fb0380584aa9: key too stale: 97, need_gc 97, disabling caching
> [ 1639.204625] CPU: 7 PID: 519 Comm: bcache_gc Tainted: G W OE 4.4.5-amd64-volpreempt-sysrq-20160312bc5 #10
> [ 1639.204627] Hardware name: LENOVO 20ERCTO1WW/20ERCTO1WW, BIOS N1DET41W (1.15 ) 12/31/2015
> [ 1639.204629] 0000000000000000 ffff8808781fbbc0 ffffffff8134d88e ffff880875040ab8
> [ 1639.204635] ffff88087a3edcd0 ffff8808781fbc00 ffffffffc03b8609 0000000000000001
> [ 1639.204639] ffff880875040ab8 ffffffffc03b17ed ffff88087a3edcd0 ffff8808781fbc50
> [ 1639.204644] Call Trace:
> [ 1639.204649] [<ffffffff8134d88e>] dump_stack+0x61/0x7d
> [ 1639.204668] [<ffffffffc03b8609>] bch_extent_bad+0xd7/0x12b [bcache]
> [ 1639.204677] [<ffffffffc03b17ed>] ? bch_ptr_invalid+0xc/0xc [bcache]
> [ 1639.204684] [<ffffffffc03b17f7>] bch_ptr_bad+0xa/0xc [bcache]

It seems that you've hit something that might not be a bug. Judging from
the backtrace, this looks like disk corruption of some kind: maybe a failed
on-SSD writeback flush, writeback controller flush (if any), or just erase
block wearout.

A quick google shows the last person to have this was in writethrough and
rebuilt their cache back in 3.11->3.14:
http://www.spinics.net/lists/linux-bcache/msg02450.html

This looks like a better thread, possibly implicating TRIM:
https://www.mail-archive.com/linux-bcache@vger.kernel.org/msg02720.html

If you are in writeback mode, then maybe you could disable gc. I don't think
there's a way to disable gc via sysfs, but you could try to comment this out:

drivers/md/bcache/super.c:
    1669         if (bch_gc_thread_start(c))
    1670                 goto err;

If it still functions (no idea, it might fail in other unexpected ways),
then perhaps you can detach your cache and get it to write back.

--
Eric Wheeler
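Since the advice above depends on whether the device is in writeback mode, it may help to check that first. bcache exposes a per-device `cache_mode` attribute in sysfs (for example under `/sys/block/bcache0/bcache/`); the sketch below assumes it prints every available mode with the active one in square brackets, as in `writethrough [writeback] writearound none`. The helper name and sample strings are hypothetical, and the function only parses a line that has already been read from that file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the bracketed token from line into out (capacity outsz, including
 * the terminating NUL). Returns 0 on success, or -1 if there is no
 * complete "[...]" token or it does not fit.
 */
static int active_cache_mode(const char *line, char *out, size_t outsz)
{
	const char *open;
	const char *close;
	size_t len;

	assert(line && out);

	open = strchr(line, '[');
	if (!open)
		return -1;
	close = strchr(open + 1, ']');
	if (!close)
		return -1;

	len = (size_t)(close - (open + 1));
	if (len + 1 > outsz)
		return -1;

	memcpy(out, open + 1, len);
	out[len] = '\0';
	return 0;
}
```

A caller would read the attribute with `fgets()` into a buffer, pass it to `active_cache_mode()`, and compare the result against `"writeback"` before deciding whether a detach has dirty data to flush.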