* bcache disabled due to inconsistent ptrs
@ 2013-07-28 22:59 sheng qiu
From: sheng qiu @ 2013-07-28 22:59 UTC (permalink / raw)
To: linux-bcache-u79uwXL29TY76Z2rM5mHXA
Hi,
I ran filebench's fileserver workload on bcache. Partway through the
run, bcache disabled the flash cache layer because of inconsistent ptrs.
Here is the error message:
bcache: error on b3f3f4cb-3df2-4564-a45f-18a05f497d78: inconsistent
ptrs: mark = 2, level = 0, disabling caching
[ 2444.047018] Pid: 2930, comm: kworker/u:1 Not tainted 3.9.0+ #196
[ 2444.047019] Call Trace:
[ 2444.047025] [<ffffffff81614bd8>] __bch_btree_mark_key+0x268/0x290
[ 2444.047027] [<ffffffff81614ff5>] btree_gc_mark_node+0x75/0x230
[ 2444.047029] [<ffffffff8161530e>] btree_gc_recurse+0x15e/0x530
[ 2444.047031] [<ffffffff8161b20d>] ? bch_btree_sort_into+0x8d/0xf0
[ 2444.047033] [<ffffffff81617794>] bch_btree_gc+0x594/0x640
[ 2444.047035] [<ffffffff8161e4be>] ? dirty_io_destructor+0xe/0x10
[ 2444.047038] [<ffffffff81077cfb>] process_one_work+0x16b/0x400
[ 2444.047040] [<ffffffff81078988>] worker_thread+0x118/0x360
[ 2444.047042] [<ffffffff81078870>] ? manage_workers+0x350/0x350
[ 2444.047044] [<ffffffff8107df10>] kthread+0xc0/0xd0
[ 2444.047046] [<ffffffff8107de50>] ? flush_kthread_worker+0xb0/0xb0
[ 2444.047048] [<ffffffff817adb2c>] ret_from_fork+0x7c/0xb0
[ 2444.047050] [<ffffffff8107de50>] ? flush_kthread_worker+0xb0/0xb0
Are there any suggestions?
Thanks,
Sheng
--
Sheng Qiu
Texas A & M University
Room 332B Wisenbaker
email: herbert1984106-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
College Station, TX 77843-3259
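For anyone triaging a report like this one, here is a minimal Python sketch (not part of the original message) that pulls the cache set UUID, mark, and level out of the error line and lists the call-trace frames that the kernel did not prefix with `?` (frames marked `?` are addresses found on the stack that the unwinder could not confirm). The regular expressions and the names `parse_bcache_error` and `reliable_frames` are illustrative assumptions based only on the text of this report; other kernel versions may format these lines differently.

```python
import re

# Assumed layout of the error line, taken from the report in this thread.
ERROR_RE = re.compile(
    r"bcache: error on (?P<uuid>[0-9a-f-]{36}): (?P<reason>[^:]+): "
    r"mark = (?P<mark>\d+), level = (?P<level>\d+), (?P<action>.+)"
)

# One call-trace entry: [<address>] optional "? " then symbol+0xoff/0xsize.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\] (?P<unsure>\? )?(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def parse_bcache_error(text):
    """Return the fields of a bcache error line as a dict, or None if absent."""
    # Mail clients may wrap long log lines, so collapse whitespace runs first.
    flat = " ".join(text.split())
    match = ERROR_RE.search(flat)
    if match is None:
        return None
    fields = match.groupdict()
    fields["mark"] = int(fields["mark"])
    fields["level"] = int(fields["level"])
    return fields


def reliable_frames(trace):
    """List function names of trace entries that are not prefixed with '?'."""
    return [m.group("func") for m in FRAME_RE.finditer(trace) if not m.group("unsure")]


report = """bcache: error on b3f3f4cb-3df2-4564-a45f-18a05f497d78: inconsistent
ptrs: mark = 2, level = 0, disabling caching"""

trace = """[ 2444.047025] [<ffffffff81614bd8>] __bch_btree_mark_key+0x268/0x290
[ 2444.047027] [<ffffffff81614ff5>] btree_gc_mark_node+0x75/0x230
[ 2444.047031] [<ffffffff8161b20d>] ? bch_btree_sort_into+0x8d/0xf0
[ 2444.047033] [<ffffffff81617794>] bch_btree_gc+0x594/0x640"""

print(parse_bcache_error(report))
print(reliable_frames(trace))
```

Read this way, the confirmed frames show the error being raised from the btree garbage-collection path (`bch_btree_gc` down to `__bch_btree_mark_key`) running in a kernel worker thread.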
* Re: bcache disabled due to inconsistent ptrs
@ 2013-07-28 23:37 Kent Overstreet
From: Kent Overstreet @ 2013-07-28 23:37 UTC (permalink / raw)
To: sheng qiu; +Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA
On Sun, Jul 28, 2013 at 06:59:10PM -0400, sheng qiu wrote:
> Hi,
>
> I ran filebench's fileserver workload on bcache. Partway through the
> run, bcache disabled the flash cache layer because of inconsistent ptrs.
> Here is the error message:
>
> bcache: error on b3f3f4cb-3df2-4564-a45f-18a05f497d78: inconsistent
> ptrs: mark = 2, level = 0, disabling caching
> [ 2444.047018] Pid: 2930, comm: kworker/u:1 Not tainted 3.9.0+ #196
> [ 2444.047019] Call Trace:
> [ 2444.047025] [<ffffffff81614bd8>] __bch_btree_mark_key+0x268/0x290
> [ 2444.047027] [<ffffffff81614ff5>] btree_gc_mark_node+0x75/0x230
> [ 2444.047029] [<ffffffff8161530e>] btree_gc_recurse+0x15e/0x530
> [ 2444.047031] [<ffffffff8161b20d>] ? bch_btree_sort_into+0x8d/0xf0
> [ 2444.047033] [<ffffffff81617794>] bch_btree_gc+0x594/0x640
> [ 2444.047035] [<ffffffff8161e4be>] ? dirty_io_destructor+0xe/0x10
> [ 2444.047038] [<ffffffff81077cfb>] process_one_work+0x16b/0x400
> [ 2444.047040] [<ffffffff81078988>] worker_thread+0x118/0x360
> [ 2444.047042] [<ffffffff81078870>] ? manage_workers+0x350/0x350
> [ 2444.047044] [<ffffffff8107df10>] kthread+0xc0/0xd0
> [ 2444.047046] [<ffffffff8107de50>] ? flush_kthread_worker+0xb0/0xb0
> [ 2444.047048] [<ffffffff817adb2c>] ret_from_fork+0x7c/0xb0
> [ 2444.047050] [<ffffffff8107de50>] ? flush_kthread_worker+0xb0/0xb0
>
> Are there any suggestions?
Erg.. What branch are you running?
* Re: bcache disabled due to inconsistent ptrs
@ 2013-07-28 23:47 sheng qiu
From: sheng qiu @ 2013-07-28 23:47 UTC (permalink / raw)
To: Kent Overstreet; +Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA
Hi Kent,
I got the bcache code with git clone
http://evilpiepirate.org/git/linux-bcache.git.
Thanks,
Sheng
On Sun, Jul 28, 2013 at 6:37 PM, Kent Overstreet <kmo-PEzghdH756F8UrSeD/g0lQ@public.gmane.org> wrote:
> On Sun, Jul 28, 2013 at 06:59:10PM -0400, sheng qiu wrote:
>> Hi,
>>
>> I ran filebench's fileserver workload on bcache. Partway through the
>> run, bcache disabled the flash cache layer because of inconsistent ptrs.
>> Here is the error message:
>>
>> bcache: error on b3f3f4cb-3df2-4564-a45f-18a05f497d78: inconsistent
>> ptrs: mark = 2, level = 0, disabling caching
>> [ 2444.047018] Pid: 2930, comm: kworker/u:1 Not tainted 3.9.0+ #196
>> [ 2444.047019] Call Trace:
>> [ 2444.047025] [<ffffffff81614bd8>] __bch_btree_mark_key+0x268/0x290
>> [ 2444.047027] [<ffffffff81614ff5>] btree_gc_mark_node+0x75/0x230
>> [ 2444.047029] [<ffffffff8161530e>] btree_gc_recurse+0x15e/0x530
>> [ 2444.047031] [<ffffffff8161b20d>] ? bch_btree_sort_into+0x8d/0xf0
>> [ 2444.047033] [<ffffffff81617794>] bch_btree_gc+0x594/0x640
>> [ 2444.047035] [<ffffffff8161e4be>] ? dirty_io_destructor+0xe/0x10
>> [ 2444.047038] [<ffffffff81077cfb>] process_one_work+0x16b/0x400
>> [ 2444.047040] [<ffffffff81078988>] worker_thread+0x118/0x360
>> [ 2444.047042] [<ffffffff81078870>] ? manage_workers+0x350/0x350
>> [ 2444.047044] [<ffffffff8107df10>] kthread+0xc0/0xd0
>> [ 2444.047046] [<ffffffff8107de50>] ? flush_kthread_worker+0xb0/0xb0
>> [ 2444.047048] [<ffffffff817adb2c>] ret_from_fork+0x7c/0xb0
>> [ 2444.047050] [<ffffffff8107de50>] ? flush_kthread_worker+0xb0/0xb0
>>
>> Are there any suggestions?
>
> Erg.. What branch are you running?
--
Sheng Qiu
Texas A & M University
Room 332B Wisenbaker
email: herbert1984106-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
College Station, TX 77843-3259
* Re: bcache disabled due to inconsistent ptrs
@ 2013-07-28 23:52 Kent Overstreet
From: Kent Overstreet @ 2013-07-28 23:52 UTC (permalink / raw)
To: sheng qiu; +Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA
On Sun, Jul 28, 2013 at 06:47:43PM -0500, sheng qiu wrote:
> Hi Kent,
>
> I got the bcache code with git clone
> http://evilpiepirate.org/git/linux-bcache.git.
Can you reproduce it? How long did it take to trigger?
* Re: bcache disabled due to inconsistent ptrs
@ 2013-07-28 23:57 sheng qiu
From: sheng qiu @ 2013-07-28 23:57 UTC (permalink / raw)
To: Kent Overstreet; +Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA
Generally, the flash size is 40GB and RAM is 8GB. The workload first
writes about 65GB of data and then does reads/writes (10 threads) on
that data.
The error shows up less than 1 hour into the read/write step, and it
happens nearly every time. I am now testing with fewer threads to see
whether it still happens.
Thanks,
Sheng
On Sun, Jul 28, 2013 at 6:52 PM, Kent Overstreet <kmo-PEzghdH756F8UrSeD/g0lQ@public.gmane.org> wrote:
> On Sun, Jul 28, 2013 at 06:47:43PM -0500, sheng qiu wrote:
>> Hi Kent,
>>
>> I got the bcache code with git clone
>> http://evilpiepirate.org/git/linux-bcache.git.
>
> Can you reproduce it? How long did it take to trigger?
--
Sheng Qiu
Texas A & M University
Room 332B Wisenbaker
email: herbert1984106-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
College Station, TX 77843-3259
* Re: bcache disabled due to inconsistent ptrs
@ 2013-07-29 19:13 Kent Overstreet
From: Kent Overstreet @ 2013-07-29 19:13 UTC (permalink / raw)
To: sheng qiu; +Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA
On Sun, Jul 28, 2013 at 06:57:42PM -0500, sheng qiu wrote:
> Generally, the flash size is 40GB and RAM is 8GB. The workload first
> writes about 65GB of data and then does reads/writes (10 threads) on
> that data.
> The error shows up less than 1 hour into the read/write step, and it
> happens nearly every time. I am now testing with fewer threads to see
> whether it still happens.
Some kind of garbage collection bug... :/
Could you try the bcache-testing branch? It's got significantly improved
garbage collection code - if it's not fixed there, at least that code
should be easier to debug...
* Re: bcache disabled due to inconsistent ptrs
@ 2013-08-01 3:14 sheng qiu
From: sheng qiu @ 2013-08-01 3:14 UTC (permalink / raw)
To: Kent Overstreet; +Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA
Hi Kent,
I found that something happens when I set a large bucket size, i.e.
bucket_size >= 2MB. When the cache becomes full (cache_available_percent
at 1%), new data keeps skipping the cache, and the cache stays in that
state.
By default, moving GC is disabled (copy_enabled = 0), so we might not
get free buckets by cleaning up invalid data. Since data skips the cache
at such high utilization, there are no insertions and no new
allocations, so nothing triggers replacement (invalidating buckets) of
existing buckets.
Is this the logic that is supposed to happen, or did I miss anything?
Thanks,
Sheng
On Mon, Jul 29, 2013 at 3:13 PM, Kent Overstreet <kmo-PEzghdH756F8UrSeD/g0lQ@public.gmane.org> wrote:
> On Sun, Jul 28, 2013 at 06:57:42PM -0500, sheng qiu wrote:
>> Generally, the flash size is 40GB and RAM is 8GB. The workload first
>> writes about 65GB of data and then does reads/writes (10 threads) on
>> that data.
>> The error shows up less than 1 hour into the read/write step, and it
>> happens nearly every time. I am now testing with fewer threads to see
>> whether it still happens.
>
> Some kind of garbage collection bug... :/
>
> Could you try the bcache-testing branch? It's got significantly improved
> garbage collection code - if it's not fixed there, at least that code
> should be easier to debug...
--
Sheng Qiu
Texas A & M University
Room 332B Wisenbaker
email: herbert1984106-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
College Station, TX 77843-3259
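The state described in this last message could be checked from user space with something like the Python sketch below (not part of the original thread). It reads a few attributes of a cache set from sysfs and applies the heuristic from the report: almost no available cache while copying GC is off. Cache set attributes live under `/sys/fs/bcache/<cset-uuid>/`; the attribute name `internal/copy_gc_enabled` is an assumption about what "copy_enabled" refers to and may differ between kernel versions, and the helper names `read_cache_set_attrs` and `looks_stuck` are hypothetical.

```python
from pathlib import Path


def read_cache_set_attrs(cset_dir, names):
    """Read small text attributes below a cache-set sysfs directory.

    Attributes that are missing or unreadable map to None instead of raising.
    """
    values = {}
    for name in names:
        try:
            values[name] = (Path(cset_dir) / name).read_text().strip()
        except OSError:
            values[name] = None
    return values


def looks_stuck(attrs, threshold=1):
    """Heuristic from the report: available percent <= threshold and copy GC off."""
    avail = attrs.get("cache_available_percent")
    # Assumed attribute name; adjust if the kernel in use exposes it elsewhere.
    copy_gc = attrs.get("internal/copy_gc_enabled")
    if avail is None or copy_gc is None:
        return False
    return int(avail) <= threshold and copy_gc == "0"


if __name__ == "__main__":
    # UUID taken from the error message earlier in the thread.
    cset = "/sys/fs/bcache/b3f3f4cb-3df2-4564-a45f-18a05f497d78"
    wanted = ["bucket_size", "cache_available_percent", "internal/copy_gc_enabled"]
    attrs = read_cache_set_attrs(cset, wanted)
    print(attrs)
    print("cache looks stuck:", looks_stuck(attrs))
```

Sampling these values periodically during the filebench run would show whether `cache_available_percent` ever recovers once it reaches 1%. The sketch only observes the state; it does not confirm whether the behavior is intended.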