public inbox for linux-bcache@vger.kernel.org
* bcache hangs on writes, recovers after disabling discard on cache device
@ 2013-06-24 18:36 Heiko Wundram
  0 siblings, 1 reply; 11+ messages in thread
From: Heiko Wundram @ 2013-06-24 18:36 UTC (permalink / raw)
  To: linux-bcache-u79uwXL29TY76Z2rM5mHXA

Hey all!

In a rather old thread of mine (514C4FC1.6090804-EqIAFqbRPK3NLxjTenLetw@public.gmane.org), I described a problem I was seeing at the time: after a while, bcache seems to hang on writes to the SSD device, continuously writing 50 MB/s to that device. Back then I thought the problem was resolved by compiling the bcache module statically into the kernel, but it seems that was not the case. With a slightly different workload, which I tested this afternoon (and which prompted my earlier mail about the "doubling" of I/O bandwidth when discard is active), I managed to get the bcache device to hang again with the same symptoms: continuous write I/O of 50 MB/s to the cache device, while no I/O at all is accepted on the bcache1 device.

Statistics I'm seeing:

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0,00     0,00    0,00  100,00     0,00    50,00  1024,00     1,00   10,00    0,00   10,00  10,00 100,00
md2               0,00     0,00    0,00    0,00     0,00     0,00     0,00     0,00    0,00    0,00    0,00   0,00   0,00
bcache1           0,00     0,00    0,00    0,00     0,00     0,00     0,00     0,00    0,00    0,00    0,00   0,00   0,00

(where sda is the SSD device in the cache set and md2 is the backing device).

While the bcache device is in this state, I see the following output in dmesg (this is from a kernel with bcache compiled as a module, but I got the same output with bcache compiled in statically):

[23866.442387] INFO: task jbd2/bcache1-8:11826 blocked for more than 120 seconds.
[23866.442453] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[23866.442518] jbd2/bcache1-8  D ffff88081fb14040     0 11826      2 0x00000000
[23866.442523]  ffff8807feb6c800 0000000000000046 000112001b8a61c0 ffff88081bf0e080
[23866.442528]  0000000000014040 ffff8804f9191fd8 ffff8804f9191fd8 ffff8807feb6c800
[23866.442531]  0000000000011200 ffff8807fec08ad0 ffff8807feb6c800 0000000000000000
[23866.442535] Call Trace:
[23866.442544]  [<ffffffff8139617a>] ? rwsem_down_read_failed+0xc8/0xde
[23866.442550]  [<ffffffff811ce814>] ? call_rwsem_down_read_failed+0x14/0x30
[23866.442553]  [<ffffffff81394ff3>] ? down_read+0x17/0x19
[23866.442565]  [<ffffffffa04c135b>] ? request_write+0x7e/0x2c8 [bcache]
[23866.442569]  [<ffffffff811ab87f>] ? generic_make_request+0x96/0xd5
[23866.442572]  [<ffffffff811ac48a>] ? submit_bio+0x10a/0x13b
[23866.442577]  [<ffffffff811301e4>] ? _submit_bh+0x1b9/0x1d7
[23866.442594]  [<ffffffffa015cc45>] ? jbd2_journal_commit_transaction+0x9a3/0x138e [jbd2]
[23866.442601]  [<ffffffff81048129>] ? lock_timer_base.isra.35+0x23/0x48
[23866.442605]  [<ffffffff813964db>] ? _raw_spin_lock_irqsave+0x14/0x35
[23866.442614]  [<ffffffffa01616b3>] ? kjournald2+0xb7/0x239 [jbd2]
[23866.442619]  [<ffffffff81058643>] ? abort_exclusive_wait+0x79/0x79
[23866.442627]  [<ffffffffa01615fc>] ? jbd2_superblock_csum+0x35/0x35 [jbd2]
[23866.442635]  [<ffffffffa01615fc>] ? jbd2_superblock_csum+0x35/0x35 [jbd2]
[23866.442638]  [<ffffffff81057ca9>] ? kthread+0x81/0x89
[23866.442641]  [<ffffffff81057c28>] ? __kthread_parkme+0x5d/0x5d
[23866.442645]  [<ffffffff8139b83c>] ? ret_from_fork+0x7c/0xb0
[23866.442648]  [<ffffffff81057c28>] ? __kthread_parkme+0x5d/0x5d
[23866.442653] INFO: task qemu-system-x86:13253 blocked for more than 120 seconds.
[23866.442713] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[23866.442777] qemu-system-x86 D ffff88081fa14040     0 13253      1 0x00000000
[23866.442780]  ffff88040c317880 0000000000000082 0001120076ecd958 ffffffff81613400
[23866.442784]  0000000000014040 ffff880176ecdfd8 ffff880176ecdfd8 ffff88040c317880
[23866.442787]  0000000000011200 ffff8807fec08ad0 ffff88040c317880 0000000000000000
[23866.442791] Call Trace:
[23866.442795]  [<ffffffff8139617a>] ? rwsem_down_read_failed+0xc8/0xde
[23866.442799]  [<ffffffff811ce814>] ? call_rwsem_down_read_failed+0x14/0x30
[23866.442802]  [<ffffffff81394ff3>] ? down_read+0x17/0x19
[23866.442809]  [<ffffffffa04c135b>] ? request_write+0x7e/0x2c8 [bcache]
[23866.442812]  [<ffffffff811ab87f>] ? generic_make_request+0x96/0xd5
[23866.442815]  [<ffffffff811ac48a>] ? submit_bio+0x10a/0x13b
[23866.442820]  [<ffffffff81135a87>] ? dio_new_bio.isra.10+0xc9/0x114
[23866.442833]  [<ffffffffa0175d6a>] ? _ext4_get_block+0x12f/0x149 [ext4]
[23866.442837]  [<ffffffff81135bc5>] ? dio_bio_submit+0x68/0x88
[23866.442841]  [<ffffffff81136971>] ? do_blockdev_direct_IO+0x957/0xae8
[23866.442846]  [<ffffffff81060245>] ? should_resched+0x5/0x23
[23866.442858]  [<ffffffffa0175d8e>] ? ext4_get_block_write_nolock+0xa/0xa [ext4]
[23866.442870]  [<ffffffffa017450a>] ? ext4_direct_IO+0x1d6/0x324 [ext4]
[23866.442880]  [<ffffffffa0175d8e>] ? ext4_get_block_write_nolock+0xa/0xa [ext4]
[23866.442890]  [<ffffffffa0172c20>] ? pagevec_release+0xb/0xb [ext4]
[23866.442894]  [<ffffffff810c312a>] ? generic_file_direct_write+0xe3/0x14a
[23866.442898]  [<ffffffff810c32ac>] ? __generic_file_aio_write+0x11b/0x1ff
[23866.442902]  [<ffffffff81060245>] ? should_resched+0x5/0x23
[23866.442912]  [<ffffffffa01706ae>] ? ext4_file_write+0x2b3/0x365 [ext4]
[23866.442917]  [<ffffffff8110d550>] ? __sb_start_write+0xb6/0xe8
[23866.442926]  [<ffffffffa01703fb>] ? ext4_unwritten_wait+0x98/0x98 [ext4]
[23866.442930]  [<ffffffff811413f7>] ? do_io_submit+0x384/0x5e4
[23866.442934]  [<ffffffff8139b8e9>] ? system_call_fastpath+0x16/0x1b
[23866.442938] INFO: task qemu-system-x86:13379 blocked for more than 120 seconds.
[23866.442997] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[23866.443068] qemu-system-x86 D ffff88081fa94040     0 13379      1 0x00000000
[23866.443073]  ffff88056fcbd7c0 0000000000000082 ffff88049862ffd8 ffff88081bf0c040
[23866.443087]  0000000000014040 ffff88049862ffd8 ffff88049862ffd8 ffff88056fcbd7c0
[23866.443092]  ffffffff8139649c ffff88055990af28 ffff8803d6efa678 ffff8805ef7f5280
[23866.443098] Call Trace:
[23866.443104]  [<ffffffff8139649c>] ? _raw_spin_unlock_irqrestore+0xc/0xd
[23866.443117]  [<ffffffffa015b997>] ? do_get_write_access+0x219/0x402 [jbd2]
[23866.443122]  [<ffffffff813964db>] ? _raw_spin_lock_irqsave+0x14/0x35
[23866.443127]  [<ffffffff81047fa1>] ? internal_add_timer+0xd/0x28
[23866.443131]  [<ffffffff81060245>] ? should_resched+0x5/0x23
[23866.443135]  [<ffffffff8105866d>] ? autoremove_wake_function+0x2a/0x2a
[23866.443148]  [<ffffffffa0178cd3>] ? ext4_dirty_inode+0x34/0x4f [ext4]
[23866.443155]  [<ffffffffa015bc74>] ? jbd2_journal_get_write_access+0x21/0x38 [jbd2]
[23866.443162]  [<ffffffffa015a667>] ? start_this_handle+0x453/0x46b [jbd2]
[23866.443166]  [<ffffffff81060245>] ? should_resched+0x5/0x23
[23866.443181]  [<ffffffffa01967a1>] ? __ext4_journal_get_write_access+0x58/0x69 [ext4]
[23866.443192]  [<ffffffffa0176691>] ? ext4_reserve_inode_write+0x37/0x7a [ext4]
[23866.443198]  [<ffffffff810fc59c>] ? kmem_cache_alloc+0x8d/0xff
[23866.443208]  [<ffffffffa0176737>] ? ext4_mark_inode_dirty+0x63/0x1f3 [ext4]
[23866.443219]  [<ffffffffa0178cd3>] ? ext4_dirty_inode+0x34/0x4f [ext4]
[23866.443229]  [<ffffffffa0178c9f>] ? ext4_evict_inode+0x2aa/0x2aa [ext4]
[23866.443233]  [<ffffffff811299f1>] ? __mark_inode_dirty+0x56/0x1ef
[23866.443237]  [<ffffffff8111dc23>] ? update_time+0xa0/0xa9
[23866.443241]  [<ffffffff811194ca>] ? set_restore_sigmask+0x2d/0x2d
[23866.443245]  [<ffffffff8111e455>] ? file_update_time+0x95/0xb5
[23866.443248]  [<ffffffff810c326e>] ? __generic_file_aio_write+0xdd/0x1ff
[23866.443252]  [<ffffffff81060245>] ? should_resched+0x5/0x23
[23866.443262]  [<ffffffffa01706ae>] ? ext4_file_write+0x2b3/0x365 [ext4]
[23866.443266]  [<ffffffff8110d550>] ? __sb_start_write+0xb6/0xe8
[23866.443276]  [<ffffffffa01703fb>] ? ext4_unwritten_wait+0x98/0x98 [ext4]
[23866.443279]  [<ffffffff811413f7>] ? do_io_submit+0x384/0x5e4
[23866.443283]  [<ffffffff8139b8e9>] ? system_call_fastpath+0x16/0x1b
[23866.443287] INFO: task qemu-system-x86:13511 blocked for more than 120 seconds.
[23866.443347] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[23866.443411] qemu-system-x86 D ffff88081fa94040     0 13511      1 0x00000000
[23866.443414]  ffff8807e45f90c0 0000000000000082 000112007753f958 ffff88081bf0c040
[23866.443417]  0000000000014040 ffff88017753ffd8 ffff88017753ffd8 ffff8807e45f90c0
[23866.443420]  0000000000011200 ffff8807fec08ad0 ffff8807e45f90c0 0000000000000000
[23866.443424] Call Trace:
[23866.443428]  [<ffffffff8139617a>] ? rwsem_down_read_failed+0xc8/0xde
[23866.443432]  [<ffffffff811ce814>] ? call_rwsem_down_read_failed+0x14/0x30
[23866.443435]  [<ffffffff81394ff3>] ? down_read+0x17/0x19
[23866.443442]  [<ffffffffa04c135b>] ? request_write+0x7e/0x2c8 [bcache]
[23866.443446]  [<ffffffff811ab87f>] ? generic_make_request+0x96/0xd5
[23866.443449]  [<ffffffff811ac48a>] ? submit_bio+0x10a/0x13b
[23866.443453]  [<ffffffff81135dbe>] ? dio_bio_add_page+0x36/0x4c
[23866.443457]  [<ffffffff81135e22>] ? dio_send_cur_page+0x4e/0xa8
[23866.443461]  [<ffffffff810cc08f>] ? put_page+0x18/0x27
[23866.443465]  [<ffffffff81135bc5>] ? dio_bio_submit+0x68/0x88
[23866.443469]  [<ffffffff81136971>] ? do_blockdev_direct_IO+0x957/0xae8
[23866.443472]  [<ffffffff81060245>] ? should_resched+0x5/0x23
[23866.443484]  [<ffffffffa0175d8e>] ? ext4_get_block_write_nolock+0xa/0xa [ext4]
[23866.443495]  [<ffffffffa017450a>] ? ext4_direct_IO+0x1d6/0x324 [ext4]
[23866.443505]  [<ffffffffa0175d8e>] ? ext4_get_block_write_nolock+0xa/0xa [ext4]
[23866.443515]  [<ffffffffa0172c20>] ? pagevec_release+0xb/0xb [ext4]
[23866.443518]  [<ffffffff810c312a>] ? generic_file_direct_write+0xe3/0x14a
[23866.443522]  [<ffffffff810c32ac>] ? __generic_file_aio_write+0x11b/0x1ff
[23866.443526]  [<ffffffff81060245>] ? should_resched+0x5/0x23
[23866.443535]  [<ffffffffa01706ae>] ? ext4_file_write+0x2b3/0x365 [ext4]
[23866.443539]  [<ffffffff8110d550>] ? __sb_start_write+0xb6/0xe8
[23866.443549]  [<ffffffffa01703fb>] ? ext4_unwritten_wait+0x98/0x98 [ext4]
[23866.443552]  [<ffffffff811413f7>] ? do_io_submit+0x384/0x5e4
[23866.443556]  [<ffffffff8139b8e9>] ? system_call_fastpath+0x16/0x1b
[23866.443561] INFO: task qemu-system-x86:17124 blocked for more than 120 seconds.
[23866.443620] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[23866.443683] qemu-system-x86 D ffff88081fa94040     0 17124      1 0x00000000
[23866.443686]  ffff88081b837780 0000000000000082 ffff8807fbf94440 ffff88081bf0c040
[23866.443689]  0000000000014040 ffff8807fd3f5fd8 ffff8807fd3f5fd8 ffff88081b837780
[23866.443692]  ffffffff810fbfe7 ffff8807fec08ad0 ffff88081b837780 0000000000000000
[23866.443696] Call Trace:
[23866.443700]  [<ffffffff810fbfe7>] ? __kmalloc+0xde/0xf0
[23866.443703]  [<ffffffff8139617a>] ? rwsem_down_read_failed+0xc8/0xde
[23866.443707]  [<ffffffff811ce814>] ? call_rwsem_down_read_failed+0x14/0x30
[23866.443710]  [<ffffffff81394ff3>] ? down_read+0x17/0x19
[23866.443717]  [<ffffffffa04c135b>] ? request_write+0x7e/0x2c8 [bcache]
[23866.443720]  [<ffffffff811ab87f>] ? generic_make_request+0x96/0xd5
[23866.443723]  [<ffffffff811ac48a>] ? submit_bio+0x10a/0x13b
[23866.443727]  [<ffffffff81135dbe>] ? dio_bio_add_page+0x36/0x4c
[23866.443731]  [<ffffffff81135e22>] ? dio_send_cur_page+0x4e/0xa8
[23866.443734]  [<ffffffff810cc08f>] ? put_page+0x18/0x27
[23866.443738]  [<ffffffff81135bc5>] ? dio_bio_submit+0x68/0x88
[23866.443742]  [<ffffffff81136971>] ? do_blockdev_direct_IO+0x957/0xae8
[23866.443746]  [<ffffffff81060245>] ? should_resched+0x5/0x23
[23866.443757]  [<ffffffffa0175d8e>] ? ext4_get_block_write_nolock+0xa/0xa [ext4]
[23866.443767]  [<ffffffffa017450a>] ? ext4_direct_IO+0x1d6/0x324 [ext4]
[23866.443777]  [<ffffffffa0175d8e>] ? ext4_get_block_write_nolock+0xa/0xa [ext4]
[23866.443787]  [<ffffffffa0172c20>] ? pagevec_release+0xb/0xb [ext4]
[23866.443790]  [<ffffffff810c312a>] ? generic_file_direct_write+0xe3/0x14a
[23866.443794]  [<ffffffff810c32ac>] ? __generic_file_aio_write+0x11b/0x1ff
[23866.443797]  [<ffffffff81060245>] ? should_resched+0x5/0x23
[23866.443807]  [<ffffffffa01706ae>] ? ext4_file_write+0x2b3/0x365 [ext4]
[23866.443811]  [<ffffffff8110d550>] ? __sb_start_write+0xb6/0xe8
[23866.443821]  [<ffffffffa01703fb>] ? ext4_unwritten_wait+0x98/0x98 [ext4]
[23866.443824]  [<ffffffff811413f7>] ? do_io_submit+0x384/0x5e4
[23866.443827]  [<ffffffff8139b8e9>] ? system_call_fastpath+0x16/0x1b
[23866.443833] INFO: task kworker/7:1:23798 blocked for more than 120 seconds.
[23866.443889] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[23866.443952] kworker/7:1     D ffff88081fbd4040     0 23798      2 0x00000000
[23866.443961] Workqueue: events update_writeback_rate [bcache]
[23866.443962]  ffff8807fc485140 0000000000000046 ffff88081fbd4040 ffff88081bf11840
[23866.443966]  0000000000014040 ffff880403119fd8 ffff880403119fd8 ffff8807fc485140
[23866.443969]  0000000000014040 ffff8807fec08ad0 ffff8807fc485140 0000000000000000
[23866.443972] Call Trace:
[23866.443976]  [<ffffffff8139617a>] ? rwsem_down_read_failed+0xc8/0xde
[23866.443980]  [<ffffffff811ce814>] ? call_rwsem_down_read_failed+0x14/0x30
[23866.443983]  [<ffffffff81394ff3>] ? down_read+0x17/0x19
[23866.443989]  [<ffffffffa04bdbc0>] ? update_writeback_rate+0x1a/0x187 [bcache]
[23866.443994]  [<ffffffff81053300>] ? process_one_work+0x191/0x28f
[23866.443998]  [<ffffffff810537ad>] ? worker_thread+0x121/0x1e7
[23866.444001]  [<ffffffff8105368c>] ? rescuer_thread+0x269/0x269
[23866.444004]  [<ffffffff81057ca9>] ? kthread+0x81/0x89
[23866.444007]  [<ffffffff81057c28>] ? __kthread_parkme+0x5d/0x5d
[23866.444010]  [<ffffffff8139b83c>] ? ret_from_fork+0x7c/0xb0
[23866.444013]  [<ffffffff81057c28>] ? __kthread_parkme+0x5d/0x5d
[23866.444018] INFO: task kworker/u16:3:24718 blocked for more than 120 seconds.
[23866.444076] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[23866.444139] kworker/u16:3   D ffff88081fa54040     0 24718      2 0x00000000
[23866.444147] Workqueue: bcache_writeback read_dirty [bcache]
[23866.444148]  ffff8807fbc4e140 0000000000000046 ffff8807fec08010 ffff8807e4994040
[23866.444151]  0000000000014040 ffff88050282bfd8 ffff88050282bfd8 ffff8807fbc4e140
[23866.444155]  0000000000000003 ffff8807fec08ad0 ffff8807fbc4e140 ffffffff00000001
[23866.444158] Call Trace:
[23866.444162]  [<ffffffff8139629c>] ? rwsem_down_write_failed+0x10c/0x18e
[23866.444165]  [<ffffffff81396465>] ? _raw_spin_lock_irq+0xb/0x15
[23866.444169]  [<ffffffff811ce843>] ? call_rwsem_down_write_failed+0x13/0x20
[23866.444172]  [<ffffffff81395019>] ? down_write+0x24/0x26
[23866.444178]  [<ffffffffa04be1f3>] ? refill_dirty+0x5d/0x1f2 [bcache]
[23866.444185]  [<ffffffffa04be6a5>] ? read_dirty+0x31d/0x369 [bcache]
[23866.444189]  [<ffffffff8100c02f>] ? load_TLS+0x7/0xa
[23866.444193]  [<ffffffff81053300>] ? process_one_work+0x191/0x28f
[23866.444197]  [<ffffffff810537ad>] ? worker_thread+0x121/0x1e7
[23866.444200]  [<ffffffff8105368c>] ? rescuer_thread+0x269/0x269
[23866.444203]  [<ffffffff81057ca9>] ? kthread+0x81/0x89
[23866.444207]  [<ffffffff81057c28>] ? __kthread_parkme+0x5d/0x5d
[23866.444209]  [<ffffffff8139b83c>] ? ret_from_fork+0x7c/0xb0
[23866.444213]  [<ffffffff81057c28>] ? __kthread_parkme+0x5d/0x5d

Disabling discard on the SSD device (by echo 0 > /sys/fs/bcache/.../cache0/discard) makes the bcache device responsive again.
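For reference, the workaround boils down to a single sysfs write. A minimal sketch follows; the real attribute lives at /sys/fs/bcache/<cache-set-uuid>/cache0/discard, and since that UUID path is system-specific, the demo below only simulates the attribute with a temporary directory:

```shell
# Toggle a cache device's discard attribute.
# $1 = sysfs directory of the cache device, $2 = 0 (disable) or 1 (enable)
set_discard() {
    echo "$2" > "$1/discard"
}

# Simulate the sysfs attribute so the sketch can be exercised anywhere:
demo="$(mktemp -d)/cache0"
mkdir -p "$demo"
echo 1 > "$demo/discard"   # pretend discard is currently enabled

# Equivalent of: echo 0 > /sys/fs/bcache/<cache-set-uuid>/cache0/discard
set_discard "$demo" 0
cat "$demo/discard"        # prints 0
```

On a real system you would point set_discard at the actual cache-set directory under /sys/fs/bcache instead of the temporary one.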

Hopefully these hints help in diagnosing the problem at hand; if you need any more information, feel free to mail me. I'll keep pounding on the server now that discard is off and try to reproduce the problem. From what I can see now, on the machine where I originally thought the problem had been fixed by statically compiling bcache, discard is and has always been off for the SSD device, so most probably that also explains the "miracle" cure there.

Thanks for your time.

-- 
--- Heiko.


end of thread, other threads:[~2013-07-26 20:13 UTC | newest]

Thread overview: 11+ messages
2013-06-24 18:36 bcache hangs on writes, recovers after disabling discard on cache device Heiko Wundram
2013-07-11 14:58 ` Juha Aatrokoski
2013-07-12  1:15   ` Kent Overstreet
2013-07-12 16:13     ` Juha Aatrokoski
2013-07-16 18:14       ` Juha Aatrokoski
2013-07-16 21:05         ` Kent Overstreet
2013-07-18 12:05           ` Juha Aatrokoski
2013-07-18 17:53             ` Kent Overstreet
2013-07-18 21:02               ` Juha Aatrokoski
2013-07-26 18:44               ` Juha Aatrokoski
2013-07-26 20:13                 ` Kent Overstreet
