* [PATCH] bcache: avoid redundant access RB tree in read_dirty
@ 2025-10-07 9:02 colyli
2025-10-10 16:13 ` kernel test robot
` (3 more replies)
0 siblings, 4 replies; 7+ messages in thread
From: colyli @ 2025-10-07 9:02 UTC (permalink / raw)
To: linux-bcache; +Cc: Coly Li, Zhou Jifeng
From: Coly Li <colyli@fnnas.com>
In bcache writeback procedure, scanned dirty keys are stored in a
red-black tree, it is dc->writeback_keys.freelist, and can be indexed by
dc->writeback_keys.keys. Inside red_dirty() one dirty key is fetched by
calling bch_keybuf_next() in each while-loop until one of the following
situations meets,
- The fetched keys number reaches MAX_WRTEBACKS_IN_PASS.
- The total size of the fetched keys reaches MAX_WRITESIZE_IN_PASS.
- All nodes in the red-black tree are iterated.
The above process is reasonable, but calling bch_keybuf_next() in each
while-loop is inefficient. Let me explain why. Let's see its code,
2767 struct keybuf_key *bch_keybuf_next(struct keybuf *buf)
2768 {
2769 struct keybuf_key *w;
2770
2771 spin_lock(&buf->lock);
2772
2773 w = RB_FIRST(&buf->keys, struct keybuf_key, node);
2774
2775 while (w && w->private)
2776 w = RB_NEXT(w, node);
2777
2778 if (w)
2779 w->private = ERR_PTR(-EINTR);
2780
2781 spin_unlock(&buf->lock);
2782 return w;
2783 }
Every time when bch_keybuf_next() is called, the red-black tree is
iterated from start until a node doesn't have w->private value. Then
this node's w->private is set by ERR_PTR(-EINTR) and pointer of the
node's holder (struct keybuf_key) is returned back to read_dirty().
In worst case, if there are 500 nodes in this red-black tree, nodes to
be iterated in each calling of bch_keybuf_next() in read_dirty() will
be: 1, 2, 3, 4, ......, 498, 499, 500. The total nodes iteration times
are 125250. If all the nodes can be fetched from read_dirty() once, the
iteration times are only 500, it is about 0.3% of the original number.
This patch adds new member dump_keys[KEYBUF_NR], it is used to store
one-time fetched dirty keys from the red-black trees. All the keys in
dump_keys[] array is in their original iterated order. A new function
bch_keybuf_dump() is used to fetch all ndoes from the red-black tree and
store the selected keys in dump_keys[]. In side read_dirty(), this new
function is called before entering while-loop, and bch_keybuf_next() is
not used anymore. Now the red-black tree only be iterated once, also the
spinlock buf->lock is only acquired once.
For the situation that not all nodes inside dump_keys[] are handled, the
non-handled nodes should be fetched next time when read_dirty() is
called again. So inside the red-black tree, w->private is not set by
ERR_PTR(-EINTR) when the node is fetched into dump_keys[]. It is set in
write_dirty_finish(), that means this node is marked as handled when the
dirty data of the key is written backed to backing device. This is a
change should be noticed.
Inside read_dirty() after closure_sync() returned, it means all write-
back bios are completed, then patch will check whether all fetched nodes
are written back. Because items in dump_keys[] array are exactly equal
to nodes in the red-black tree, if all the keys in dump_keys[] are
handled, the red-black tree is reset directly by array_allocator_init().
Only when unexpected condition happens and there are some items inside
dump_keys[] array not handled, bch_keybuf_del() is called to delete the
written dirty keys one by one from the red-black tree.
In practice, testing shows most of the time all items in dump_keys[] are
all handled promptly, the red-black tree is directly reset.
Answers to some potential concerns:
- Memory barrier issue for dc->writeback_keys.hangled.
This atomic_t value is increased inside semaphore dc->in_flight in
write_dirty_finish(). Calling up(&dc->in_flight) has implicit memory
barrier inside, no need to redundant explicit memory barrier call.
- Concurrent access to the red-black tree in dc->writeback_keys.freelist
Access to dc->writeback_keys.freelist is guarded by
dc->writeback_keys.lock, it is safe for concurrent access on it.
Indeed before returning from read_dirty(), no one else will access
this red-black tree (no matter adding new nodes or setting w->private)
it is also safe to check w->private inside read_dirty() without
holding dc->writeback_keys.lock after closure_sync() returned.
The idea was inspired by code review comments of previous patches from
Zhou Jifeng.
Signed-off-by: Coly Li <colyli@fnnas.com>
Cc: Zhou Jifeng <zhoujifeng@kylinos.com.cn>
---
drivers/md/bcache/bcache.h | 2 ++
drivers/md/bcache/btree.c | 28 ++++++++++++++++++++++
drivers/md/bcache/btree.h | 2 ++
drivers/md/bcache/writeback.c | 45 +++++++++++++++++++++++++----------
4 files changed, 65 insertions(+), 12 deletions(-)
diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index 8ccacba85547..06880e2d5b1d 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -240,11 +240,13 @@ struct keybuf {
*/
struct bkey start;
struct bkey end;
+ atomic_t handled;
struct rb_root keys;
#define KEYBUF_NR 500
DECLARE_ARRAY_ALLOCATOR(struct keybuf_key, freelist, KEYBUF_NR);
+ struct keybuf_key *dump_keys[KEYBUF_NR];
};
struct bcache_device {
diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index bdb90833bff0..e107d0f05ee4 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -2763,6 +2763,34 @@ bool bch_keybuf_check_overlapping(struct keybuf *buf, struct bkey *start,
return ret;
}
+int bch_keybuf_dump(struct keybuf *buf, struct keybuf_key *dump_list[],
+ int dump_list_len)
+{
+ struct keybuf_key *w;
+ int i = 0;
+
+ memset(dump_list, 0, dump_list_len * sizeof(struct keybuf_key *));
+
+ spin_lock(&buf->lock);
+
+ w = RB_FIRST(&buf->keys, struct keybuf_key, node);
+
+ while (w && i < dump_list_len) {
+ if (w->private) {
+ w = RB_NEXT(w, node);
+ continue;
+ }
+
+ dump_list[i++] = w;
+ w->private = ERR_PTR(-EINTR);
+ w = RB_NEXT(w, node);
+ }
+
+ spin_unlock(&buf->lock);
+
+ return i;
+}
+
struct keybuf_key *bch_keybuf_next(struct keybuf *buf)
{
struct keybuf_key *w;
diff --git a/drivers/md/bcache/btree.h b/drivers/md/bcache/btree.h
index 45d64b54115a..65d81a7b5ac6 100644
--- a/drivers/md/bcache/btree.h
+++ b/drivers/md/bcache/btree.h
@@ -413,5 +413,7 @@ struct keybuf_key *bch_keybuf_next_rescan(struct cache_set *c,
struct keybuf *buf,
struct bkey *end,
keybuf_pred_fn *pred);
+int bch_keybuf_dump(struct keybuf *buf, struct keybuf_key *dump_list[],
+ int dump_list_len);
void bch_update_bucket_in_use(struct cache_set *c, struct gc_stat *stats);
#endif
diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
index 59cd0c3f8ce9..e6c548e83ff1 100644
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -381,7 +381,8 @@ static CLOSURE_CALLBACK(write_dirty_finish)
: &dc->disk.c->writeback_keys_done);
}
- bch_keybuf_del(&dc->writeback_keys, w);
+ w->private = ERR_PTR(-EINTR);
+ atomic_inc(&dc->writeback_keys.handled);
up(&dc->in_flight);
closure_return_with_destructor(cl, dirty_io_destructor);
@@ -474,8 +475,10 @@ static CLOSURE_CALLBACK(read_dirty_submit)
static void read_dirty(struct cached_dev *dc)
{
unsigned int delay = 0;
- struct keybuf_key *next, *keys[MAX_WRITEBACKS_IN_PASS], *w;
+ struct keybuf_key *keys[MAX_WRITEBACKS_IN_PASS], *w;
+ struct keybuf_key **dump_keys;
size_t size;
+ int checked, dump_nr;
int nk, i;
struct dirty_io *io;
struct closure cl;
@@ -489,17 +492,22 @@ static void read_dirty(struct cached_dev *dc)
* XXX: if we error, background writeback just spins. Should use some
* mempools.
*/
-
- next = bch_keybuf_next(&dc->writeback_keys);
+ dump_nr = bch_keybuf_dump(&dc->writeback_keys,
+ dc->writeback_keys.dump_keys,
+ ARRAY_SIZE(dc->writeback_keys.dump_keys));
+ dump_keys = dc->writeback_keys.dump_keys;
+ atomic_set(&dc->writeback_keys.handled, 0);
+ checked = 0;
while (!kthread_should_stop() &&
!test_bit(CACHE_SET_IO_DISABLE, &dc->disk.c->flags) &&
- next) {
+ (checked < dump_nr)) {
size = 0;
nk = 0;
do {
- BUG_ON(ptr_stale(dc->disk.c, &next->key, 0));
+ w = dump_keys[checked];
+ BUG_ON(ptr_stale(dc->disk.c, &w->key, 0));
/*
* Don't combine too many operations, even if they
@@ -525,12 +533,12 @@ static void read_dirty(struct cached_dev *dc)
* command queueing.
*/
if ((nk != 0) && bkey_cmp(&keys[nk-1]->key,
- &START_KEY(&next->key)))
+ &START_KEY(&w->key)))
break;
- size += KEY_SIZE(&next->key);
- keys[nk++] = next;
- } while ((next = bch_keybuf_next(&dc->writeback_keys)));
+ size += KEY_SIZE(&w->key);
+ keys[nk++] = w;
+ } while (++checked < dump_nr);
/* Now we have gathered a set of 1..5 keys to write back. */
for (i = 0; i < nk; i++) {
@@ -581,7 +589,6 @@ static void read_dirty(struct cached_dev *dc)
err_free:
kfree(w->private);
err:
- bch_keybuf_del(&dc->writeback_keys, w);
}
/*
@@ -589,6 +596,21 @@ static void read_dirty(struct cached_dev *dc)
* freed) before refilling again
*/
closure_sync(&cl);
+
+ if (atomic_read(&dc->writeback_keys.handled) == dump_nr) {
+ spin_lock(&dc->writeback_keys.lock);
+ dc->writeback_keys.keys = RB_ROOT;
+ array_allocator_init(&dc->writeback_keys.freelist);
+ spin_unlock(&dc->writeback_keys.lock);
+ } else {
+ for (i = 0; i < dump_nr; i++) {
+ w = dump_keys[i];
+ if (!w->private)
+ continue;
+ bch_keybuf_del(&dc->writeback_keys, w);
+ }
+ }
+ atomic_set(&dc->writeback_keys.handled, 0);
}
/* Scan for dirty data */
@@ -820,7 +842,6 @@ static int bch_writeback_thread(void *arg)
if (searched_full_index) {
unsigned int delay = dc->writeback_delay * HZ;
-
while (delay &&
!kthread_should_stop() &&
!test_bit(CACHE_SET_IO_DISABLE, &c->flags) &&
--
2.39.5
^ permalink raw reply related [flat|nested] 7+ messages in thread* Re: [PATCH] bcache: avoid redundant access RB tree in read_dirty
2025-10-07 9:02 [PATCH] bcache: avoid redundant access RB tree in read_dirty colyli
@ 2025-10-10 16:13 ` kernel test robot
2025-10-10 16:56 ` kernel test robot
` (2 subsequent siblings)
3 siblings, 0 replies; 7+ messages in thread
From: kernel test robot @ 2025-10-10 16:13 UTC (permalink / raw)
To: colyli, linux-bcache; +Cc: llvm, oe-kbuild-all, Coly Li, Zhou Jifeng
Hi,
kernel test robot noticed the following build warnings:
[auto build test WARNING on linus/master]
[also build test WARNING on v6.17 next-20251010]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/colyli-fnnas-com/bcache-avoid-redundant-access-RB-tree-in-read_dirty/20251010-103843
base: linus/master
patch link: https://lore.kernel.org/r/20251007090232.30386-1-colyli%40fnnas.com
patch subject: [PATCH] bcache: avoid redundant access RB tree in read_dirty
config: x86_64-buildonly-randconfig-001-20251010 (https://download.01.org/0day-ci/archive/20251011/202510110048.PdCCuB3b-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251011/202510110048.PdCCuB3b-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510110048.PdCCuB3b-lkp@intel.com/
All warnings (new ones prefixed by >>):
>> drivers/md/bcache/writeback.c:592:2: warning: label at end of compound statement is a C23 extension [-Wc23-extensions]
592 | }
| ^
1 warning generated.
vim +592 drivers/md/bcache/writeback.c
cafe563591446c Kent Overstreet 2013-03-23 474
5e6926daac267d Kent Overstreet 2013-07-24 475 static void read_dirty(struct cached_dev *dc)
cafe563591446c Kent Overstreet 2013-03-23 476 {
6f10f7d1b02b1b Coly Li 2018-08-11 477 unsigned int delay = 0;
448f60e858d871 Coly Li 2025-10-07 478 struct keybuf_key *keys[MAX_WRITEBACKS_IN_PASS], *w;
448f60e858d871 Coly Li 2025-10-07 479 struct keybuf_key **dump_keys;
539d39eb270834 Tang Junhui 2018-01-08 480 size_t size;
448f60e858d871 Coly Li 2025-10-07 481 int checked, dump_nr;
539d39eb270834 Tang Junhui 2018-01-08 482 int nk, i;
cafe563591446c Kent Overstreet 2013-03-23 483 struct dirty_io *io;
5e6926daac267d Kent Overstreet 2013-07-24 484 struct closure cl;
6e6ccc67b9c7a6 Michael Lyle 2018-01-08 485 uint16_t sequence = 0;
5e6926daac267d Kent Overstreet 2013-07-24 486
6e6ccc67b9c7a6 Michael Lyle 2018-01-08 487 BUG_ON(!llist_empty(&dc->writeback_ordering_wait.list));
6e6ccc67b9c7a6 Michael Lyle 2018-01-08 488 atomic_set(&dc->writeback_sequence_next, sequence);
5e6926daac267d Kent Overstreet 2013-07-24 489 closure_init_stack(&cl);
cafe563591446c Kent Overstreet 2013-03-23 490
cafe563591446c Kent Overstreet 2013-03-23 491 /*
cafe563591446c Kent Overstreet 2013-03-23 492 * XXX: if we error, background writeback just spins. Should use some
cafe563591446c Kent Overstreet 2013-03-23 493 * mempools.
cafe563591446c Kent Overstreet 2013-03-23 494 */
448f60e858d871 Coly Li 2025-10-07 495 dump_nr = bch_keybuf_dump(&dc->writeback_keys,
448f60e858d871 Coly Li 2025-10-07 496 dc->writeback_keys.dump_keys,
448f60e858d871 Coly Li 2025-10-07 497 ARRAY_SIZE(dc->writeback_keys.dump_keys));
448f60e858d871 Coly Li 2025-10-07 498 dump_keys = dc->writeback_keys.dump_keys;
448f60e858d871 Coly Li 2025-10-07 499 atomic_set(&dc->writeback_keys.handled, 0);
448f60e858d871 Coly Li 2025-10-07 500 checked = 0;
5e6926daac267d Kent Overstreet 2013-07-24 501
771f393e8ffc9b Coly Li 2018-03-18 502 while (!kthread_should_stop() &&
771f393e8ffc9b Coly Li 2018-03-18 503 !test_bit(CACHE_SET_IO_DISABLE, &dc->disk.c->flags) &&
448f60e858d871 Coly Li 2025-10-07 504 (checked < dump_nr)) {
539d39eb270834 Tang Junhui 2018-01-08 505 size = 0;
539d39eb270834 Tang Junhui 2018-01-08 506 nk = 0;
539d39eb270834 Tang Junhui 2018-01-08 507
539d39eb270834 Tang Junhui 2018-01-08 508 do {
448f60e858d871 Coly Li 2025-10-07 509 w = dump_keys[checked];
448f60e858d871 Coly Li 2025-10-07 510 BUG_ON(ptr_stale(dc->disk.c, &w->key, 0));
539d39eb270834 Tang Junhui 2018-01-08 511
539d39eb270834 Tang Junhui 2018-01-08 512 /*
539d39eb270834 Tang Junhui 2018-01-08 513 * Don't combine too many operations, even if they
539d39eb270834 Tang Junhui 2018-01-08 514 * are all small.
539d39eb270834 Tang Junhui 2018-01-08 515 */
539d39eb270834 Tang Junhui 2018-01-08 516 if (nk >= MAX_WRITEBACKS_IN_PASS)
cafe563591446c Kent Overstreet 2013-03-23 517 break;
cafe563591446c Kent Overstreet 2013-03-23 518
539d39eb270834 Tang Junhui 2018-01-08 519 /*
539d39eb270834 Tang Junhui 2018-01-08 520 * If the current operation is very large, don't
539d39eb270834 Tang Junhui 2018-01-08 521 * further combine operations.
539d39eb270834 Tang Junhui 2018-01-08 522 */
539d39eb270834 Tang Junhui 2018-01-08 523 if (size >= MAX_WRITESIZE_IN_PASS)
539d39eb270834 Tang Junhui 2018-01-08 524 break;
cafe563591446c Kent Overstreet 2013-03-23 525
539d39eb270834 Tang Junhui 2018-01-08 526 /*
539d39eb270834 Tang Junhui 2018-01-08 527 * Operations are only eligible to be combined
539d39eb270834 Tang Junhui 2018-01-08 528 * if they are contiguous.
539d39eb270834 Tang Junhui 2018-01-08 529 *
539d39eb270834 Tang Junhui 2018-01-08 530 * TODO: add a heuristic willing to fire a
539d39eb270834 Tang Junhui 2018-01-08 531 * certain amount of non-contiguous IO per pass,
539d39eb270834 Tang Junhui 2018-01-08 532 * so that we can benefit from backing device
539d39eb270834 Tang Junhui 2018-01-08 533 * command queueing.
539d39eb270834 Tang Junhui 2018-01-08 534 */
539d39eb270834 Tang Junhui 2018-01-08 535 if ((nk != 0) && bkey_cmp(&keys[nk-1]->key,
448f60e858d871 Coly Li 2025-10-07 536 &START_KEY(&w->key)))
539d39eb270834 Tang Junhui 2018-01-08 537 break;
cafe563591446c Kent Overstreet 2013-03-23 538
448f60e858d871 Coly Li 2025-10-07 539 size += KEY_SIZE(&w->key);
448f60e858d871 Coly Li 2025-10-07 540 keys[nk++] = w;
448f60e858d871 Coly Li 2025-10-07 541 } while (++checked < dump_nr);
cafe563591446c Kent Overstreet 2013-03-23 542
539d39eb270834 Tang Junhui 2018-01-08 543 /* Now we have gathered a set of 1..5 keys to write back. */
539d39eb270834 Tang Junhui 2018-01-08 544 for (i = 0; i < nk; i++) {
539d39eb270834 Tang Junhui 2018-01-08 545 w = keys[i];
539d39eb270834 Tang Junhui 2018-01-08 546
d86eaa0f3c56da Christoph Hellwig 2025-09-08 547 io = kzalloc(sizeof(*io) + sizeof(struct bio_vec) *
d86eaa0f3c56da Christoph Hellwig 2025-09-08 548 DIV_ROUND_UP(KEY_SIZE(&w->key), PAGE_SECTORS),
cafe563591446c Kent Overstreet 2013-03-23 549 GFP_KERNEL);
cafe563591446c Kent Overstreet 2013-03-23 550 if (!io)
cafe563591446c Kent Overstreet 2013-03-23 551 goto err;
cafe563591446c Kent Overstreet 2013-03-23 552
cafe563591446c Kent Overstreet 2013-03-23 553 w->private = io;
cafe563591446c Kent Overstreet 2013-03-23 554 io->dc = dc;
6e6ccc67b9c7a6 Michael Lyle 2018-01-08 555 io->sequence = sequence++;
cafe563591446c Kent Overstreet 2013-03-23 556
cafe563591446c Kent Overstreet 2013-03-23 557 dirty_init(w);
c34b7ac6508755 Christoph Hellwig 2022-12-06 558 io->bio.bi_opf = REQ_OP_READ;
4f024f3797c43c Kent Overstreet 2013-10-11 559 io->bio.bi_iter.bi_sector = PTR_OFFSET(&w->key, 0);
11e9560e6c005b Christoph Hellwig 2021-04-11 560 bio_set_dev(&io->bio, dc->disk.c->cache->bdev);
cafe563591446c Kent Overstreet 2013-03-23 561 io->bio.bi_end_io = read_dirty_endio;
cafe563591446c Kent Overstreet 2013-03-23 562
25d8be77e19224 Ming Lei 2017-12-18 563 if (bch_bio_alloc_pages(&io->bio, GFP_KERNEL))
cafe563591446c Kent Overstreet 2013-03-23 564 goto err_free;
cafe563591446c Kent Overstreet 2013-03-23 565
c37511b863f36c Kent Overstreet 2013-04-26 566 trace_bcache_writeback(&w->key);
cafe563591446c Kent Overstreet 2013-03-23 567
c2a4f3183a1248 Kent Overstreet 2013-09-23 568 down(&dc->in_flight);
539d39eb270834 Tang Junhui 2018-01-08 569
3be11dbab67a3e Coly Li 2018-08-11 570 /*
3be11dbab67a3e Coly Li 2018-08-11 571 * We've acquired a semaphore for the maximum
539d39eb270834 Tang Junhui 2018-01-08 572 * simultaneous number of writebacks; from here
539d39eb270834 Tang Junhui 2018-01-08 573 * everything happens asynchronously.
539d39eb270834 Tang Junhui 2018-01-08 574 */
5e6926daac267d Kent Overstreet 2013-07-24 575 closure_call(&io->cl, read_dirty_submit, NULL, &cl);
539d39eb270834 Tang Junhui 2018-01-08 576 }
cafe563591446c Kent Overstreet 2013-03-23 577
539d39eb270834 Tang Junhui 2018-01-08 578 delay = writeback_delay(dc, size);
539d39eb270834 Tang Junhui 2018-01-08 579
771f393e8ffc9b Coly Li 2018-03-18 580 while (!kthread_should_stop() &&
771f393e8ffc9b Coly Li 2018-03-18 581 !test_bit(CACHE_SET_IO_DISABLE, &dc->disk.c->flags) &&
771f393e8ffc9b Coly Li 2018-03-18 582 delay) {
539d39eb270834 Tang Junhui 2018-01-08 583 schedule_timeout_interruptible(delay);
539d39eb270834 Tang Junhui 2018-01-08 584 delay = writeback_delay(dc, 0);
539d39eb270834 Tang Junhui 2018-01-08 585 }
cafe563591446c Kent Overstreet 2013-03-23 586 }
cafe563591446c Kent Overstreet 2013-03-23 587
cafe563591446c Kent Overstreet 2013-03-23 588 if (0) {
cafe563591446c Kent Overstreet 2013-03-23 589 err_free:
cafe563591446c Kent Overstreet 2013-03-23 590 kfree(w->private);
cafe563591446c Kent Overstreet 2013-03-23 591 err:
cafe563591446c Kent Overstreet 2013-03-23 @592 }
cafe563591446c Kent Overstreet 2013-03-23 593
c2a4f3183a1248 Kent Overstreet 2013-09-23 594 /*
c2a4f3183a1248 Kent Overstreet 2013-09-23 595 * Wait for outstanding writeback IOs to finish (and keybuf slots to be
c2a4f3183a1248 Kent Overstreet 2013-09-23 596 * freed) before refilling again
c2a4f3183a1248 Kent Overstreet 2013-09-23 597 */
5e6926daac267d Kent Overstreet 2013-07-24 598 closure_sync(&cl);
448f60e858d871 Coly Li 2025-10-07 599
448f60e858d871 Coly Li 2025-10-07 600 if (atomic_read(&dc->writeback_keys.handled) == dump_nr) {
448f60e858d871 Coly Li 2025-10-07 601 spin_lock(&dc->writeback_keys.lock);
448f60e858d871 Coly Li 2025-10-07 602 dc->writeback_keys.keys = RB_ROOT;
448f60e858d871 Coly Li 2025-10-07 603 array_allocator_init(&dc->writeback_keys.freelist);
448f60e858d871 Coly Li 2025-10-07 604 spin_unlock(&dc->writeback_keys.lock);
448f60e858d871 Coly Li 2025-10-07 605 } else {
448f60e858d871 Coly Li 2025-10-07 606 for (i = 0; i < dump_nr; i++) {
448f60e858d871 Coly Li 2025-10-07 607 w = dump_keys[i];
448f60e858d871 Coly Li 2025-10-07 608 if (!w->private)
448f60e858d871 Coly Li 2025-10-07 609 continue;
448f60e858d871 Coly Li 2025-10-07 610 bch_keybuf_del(&dc->writeback_keys, w);
448f60e858d871 Coly Li 2025-10-07 611 }
448f60e858d871 Coly Li 2025-10-07 612 }
448f60e858d871 Coly Li 2025-10-07 613 atomic_set(&dc->writeback_keys.handled, 0);
5e6926daac267d Kent Overstreet 2013-07-24 614 }
5e6926daac267d Kent Overstreet 2013-07-24 615
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH] bcache: avoid redundant access RB tree in read_dirty
2025-10-07 9:02 [PATCH] bcache: avoid redundant access RB tree in read_dirty colyli
2025-10-10 16:13 ` kernel test robot
@ 2025-10-10 16:56 ` kernel test robot
2025-10-20 16:44 ` Pierre Juhen
[not found] ` <050fe436-e629-4428-8e4d-33edd8985767@orange.fr>
3 siblings, 0 replies; 7+ messages in thread
From: kernel test robot @ 2025-10-10 16:56 UTC (permalink / raw)
To: colyli, linux-bcache; +Cc: oe-kbuild-all, Coly Li, Zhou Jifeng
Hi,
kernel test robot noticed the following build errors:
[auto build test ERROR on linus/master]
[also build test ERROR on v6.17 next-20251010]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/colyli-fnnas-com/bcache-avoid-redundant-access-RB-tree-in-read_dirty/20251010-103843
base: linus/master
patch link: https://lore.kernel.org/r/20251007090232.30386-1-colyli%40fnnas.com
patch subject: [PATCH] bcache: avoid redundant access RB tree in read_dirty
config: sparc-randconfig-002-20251010 (https://download.01.org/0day-ci/archive/20251011/202510110055.zirjAQZU-lkp@intel.com/config)
compiler: sparc64-linux-gcc (GCC) 8.5.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251011/202510110055.zirjAQZU-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510110055.zirjAQZU-lkp@intel.com/
All errors (new ones prefixed by >>):
drivers/md/bcache/writeback.c: In function 'read_dirty':
>> drivers/md/bcache/writeback.c:591:1: error: label at end of compound statement
err:
^~~
vim +591 drivers/md/bcache/writeback.c
cafe563591446c Kent Overstreet 2013-03-23 474
5e6926daac267d Kent Overstreet 2013-07-24 475 static void read_dirty(struct cached_dev *dc)
cafe563591446c Kent Overstreet 2013-03-23 476 {
6f10f7d1b02b1b Coly Li 2018-08-11 477 unsigned int delay = 0;
448f60e858d871 Coly Li 2025-10-07 478 struct keybuf_key *keys[MAX_WRITEBACKS_IN_PASS], *w;
448f60e858d871 Coly Li 2025-10-07 479 struct keybuf_key **dump_keys;
539d39eb270834 Tang Junhui 2018-01-08 480 size_t size;
448f60e858d871 Coly Li 2025-10-07 481 int checked, dump_nr;
539d39eb270834 Tang Junhui 2018-01-08 482 int nk, i;
cafe563591446c Kent Overstreet 2013-03-23 483 struct dirty_io *io;
5e6926daac267d Kent Overstreet 2013-07-24 484 struct closure cl;
6e6ccc67b9c7a6 Michael Lyle 2018-01-08 485 uint16_t sequence = 0;
5e6926daac267d Kent Overstreet 2013-07-24 486
6e6ccc67b9c7a6 Michael Lyle 2018-01-08 487 BUG_ON(!llist_empty(&dc->writeback_ordering_wait.list));
6e6ccc67b9c7a6 Michael Lyle 2018-01-08 488 atomic_set(&dc->writeback_sequence_next, sequence);
5e6926daac267d Kent Overstreet 2013-07-24 489 closure_init_stack(&cl);
cafe563591446c Kent Overstreet 2013-03-23 490
cafe563591446c Kent Overstreet 2013-03-23 491 /*
cafe563591446c Kent Overstreet 2013-03-23 492 * XXX: if we error, background writeback just spins. Should use some
cafe563591446c Kent Overstreet 2013-03-23 493 * mempools.
cafe563591446c Kent Overstreet 2013-03-23 494 */
448f60e858d871 Coly Li 2025-10-07 495 dump_nr = bch_keybuf_dump(&dc->writeback_keys,
448f60e858d871 Coly Li 2025-10-07 496 dc->writeback_keys.dump_keys,
448f60e858d871 Coly Li 2025-10-07 497 ARRAY_SIZE(dc->writeback_keys.dump_keys));
448f60e858d871 Coly Li 2025-10-07 498 dump_keys = dc->writeback_keys.dump_keys;
448f60e858d871 Coly Li 2025-10-07 499 atomic_set(&dc->writeback_keys.handled, 0);
448f60e858d871 Coly Li 2025-10-07 500 checked = 0;
5e6926daac267d Kent Overstreet 2013-07-24 501
771f393e8ffc9b Coly Li 2018-03-18 502 while (!kthread_should_stop() &&
771f393e8ffc9b Coly Li 2018-03-18 503 !test_bit(CACHE_SET_IO_DISABLE, &dc->disk.c->flags) &&
448f60e858d871 Coly Li 2025-10-07 504 (checked < dump_nr)) {
539d39eb270834 Tang Junhui 2018-01-08 505 size = 0;
539d39eb270834 Tang Junhui 2018-01-08 506 nk = 0;
539d39eb270834 Tang Junhui 2018-01-08 507
539d39eb270834 Tang Junhui 2018-01-08 508 do {
448f60e858d871 Coly Li 2025-10-07 509 w = dump_keys[checked];
448f60e858d871 Coly Li 2025-10-07 510 BUG_ON(ptr_stale(dc->disk.c, &w->key, 0));
539d39eb270834 Tang Junhui 2018-01-08 511
539d39eb270834 Tang Junhui 2018-01-08 512 /*
539d39eb270834 Tang Junhui 2018-01-08 513 * Don't combine too many operations, even if they
539d39eb270834 Tang Junhui 2018-01-08 514 * are all small.
539d39eb270834 Tang Junhui 2018-01-08 515 */
539d39eb270834 Tang Junhui 2018-01-08 516 if (nk >= MAX_WRITEBACKS_IN_PASS)
cafe563591446c Kent Overstreet 2013-03-23 517 break;
cafe563591446c Kent Overstreet 2013-03-23 518
539d39eb270834 Tang Junhui 2018-01-08 519 /*
539d39eb270834 Tang Junhui 2018-01-08 520 * If the current operation is very large, don't
539d39eb270834 Tang Junhui 2018-01-08 521 * further combine operations.
539d39eb270834 Tang Junhui 2018-01-08 522 */
539d39eb270834 Tang Junhui 2018-01-08 523 if (size >= MAX_WRITESIZE_IN_PASS)
539d39eb270834 Tang Junhui 2018-01-08 524 break;
cafe563591446c Kent Overstreet 2013-03-23 525
539d39eb270834 Tang Junhui 2018-01-08 526 /*
539d39eb270834 Tang Junhui 2018-01-08 527 * Operations are only eligible to be combined
539d39eb270834 Tang Junhui 2018-01-08 528 * if they are contiguous.
539d39eb270834 Tang Junhui 2018-01-08 529 *
539d39eb270834 Tang Junhui 2018-01-08 530 * TODO: add a heuristic willing to fire a
539d39eb270834 Tang Junhui 2018-01-08 531 * certain amount of non-contiguous IO per pass,
539d39eb270834 Tang Junhui 2018-01-08 532 * so that we can benefit from backing device
539d39eb270834 Tang Junhui 2018-01-08 533 * command queueing.
539d39eb270834 Tang Junhui 2018-01-08 534 */
539d39eb270834 Tang Junhui 2018-01-08 535 if ((nk != 0) && bkey_cmp(&keys[nk-1]->key,
448f60e858d871 Coly Li 2025-10-07 536 &START_KEY(&w->key)))
539d39eb270834 Tang Junhui 2018-01-08 537 break;
cafe563591446c Kent Overstreet 2013-03-23 538
448f60e858d871 Coly Li 2025-10-07 539 size += KEY_SIZE(&w->key);
448f60e858d871 Coly Li 2025-10-07 540 keys[nk++] = w;
448f60e858d871 Coly Li 2025-10-07 541 } while (++checked < dump_nr);
cafe563591446c Kent Overstreet 2013-03-23 542
539d39eb270834 Tang Junhui 2018-01-08 543 /* Now we have gathered a set of 1..5 keys to write back. */
539d39eb270834 Tang Junhui 2018-01-08 544 for (i = 0; i < nk; i++) {
539d39eb270834 Tang Junhui 2018-01-08 545 w = keys[i];
539d39eb270834 Tang Junhui 2018-01-08 546
d86eaa0f3c56da Christoph Hellwig 2025-09-08 547 io = kzalloc(sizeof(*io) + sizeof(struct bio_vec) *
d86eaa0f3c56da Christoph Hellwig 2025-09-08 548 DIV_ROUND_UP(KEY_SIZE(&w->key), PAGE_SECTORS),
cafe563591446c Kent Overstreet 2013-03-23 549 GFP_KERNEL);
cafe563591446c Kent Overstreet 2013-03-23 550 if (!io)
cafe563591446c Kent Overstreet 2013-03-23 551 goto err;
cafe563591446c Kent Overstreet 2013-03-23 552
cafe563591446c Kent Overstreet 2013-03-23 553 w->private = io;
cafe563591446c Kent Overstreet 2013-03-23 554 io->dc = dc;
6e6ccc67b9c7a6 Michael Lyle 2018-01-08 555 io->sequence = sequence++;
cafe563591446c Kent Overstreet 2013-03-23 556
cafe563591446c Kent Overstreet 2013-03-23 557 dirty_init(w);
c34b7ac6508755 Christoph Hellwig 2022-12-06 558 io->bio.bi_opf = REQ_OP_READ;
4f024f3797c43c Kent Overstreet 2013-10-11 559 io->bio.bi_iter.bi_sector = PTR_OFFSET(&w->key, 0);
11e9560e6c005b Christoph Hellwig 2021-04-11 560 bio_set_dev(&io->bio, dc->disk.c->cache->bdev);
cafe563591446c Kent Overstreet 2013-03-23 561 io->bio.bi_end_io = read_dirty_endio;
cafe563591446c Kent Overstreet 2013-03-23 562
25d8be77e19224 Ming Lei 2017-12-18 563 if (bch_bio_alloc_pages(&io->bio, GFP_KERNEL))
cafe563591446c Kent Overstreet 2013-03-23 564 goto err_free;
cafe563591446c Kent Overstreet 2013-03-23 565
c37511b863f36c Kent Overstreet 2013-04-26 566 trace_bcache_writeback(&w->key);
cafe563591446c Kent Overstreet 2013-03-23 567
c2a4f3183a1248 Kent Overstreet 2013-09-23 568 down(&dc->in_flight);
539d39eb270834 Tang Junhui 2018-01-08 569
3be11dbab67a3e Coly Li 2018-08-11 570 /*
3be11dbab67a3e Coly Li 2018-08-11 571 * We've acquired a semaphore for the maximum
539d39eb270834 Tang Junhui 2018-01-08 572 * simultaneous number of writebacks; from here
539d39eb270834 Tang Junhui 2018-01-08 573 * everything happens asynchronously.
539d39eb270834 Tang Junhui 2018-01-08 574 */
5e6926daac267d Kent Overstreet 2013-07-24 575 closure_call(&io->cl, read_dirty_submit, NULL, &cl);
539d39eb270834 Tang Junhui 2018-01-08 576 }
cafe563591446c Kent Overstreet 2013-03-23 577
539d39eb270834 Tang Junhui 2018-01-08 578 delay = writeback_delay(dc, size);
539d39eb270834 Tang Junhui 2018-01-08 579
771f393e8ffc9b Coly Li 2018-03-18 580 while (!kthread_should_stop() &&
771f393e8ffc9b Coly Li 2018-03-18 581 !test_bit(CACHE_SET_IO_DISABLE, &dc->disk.c->flags) &&
771f393e8ffc9b Coly Li 2018-03-18 582 delay) {
539d39eb270834 Tang Junhui 2018-01-08 583 schedule_timeout_interruptible(delay);
539d39eb270834 Tang Junhui 2018-01-08 584 delay = writeback_delay(dc, 0);
539d39eb270834 Tang Junhui 2018-01-08 585 }
cafe563591446c Kent Overstreet 2013-03-23 586 }
cafe563591446c Kent Overstreet 2013-03-23 587
cafe563591446c Kent Overstreet 2013-03-23 588 if (0) {
cafe563591446c Kent Overstreet 2013-03-23 589 err_free:
cafe563591446c Kent Overstreet 2013-03-23 590 kfree(w->private);
cafe563591446c Kent Overstreet 2013-03-23 @591 err:
cafe563591446c Kent Overstreet 2013-03-23 592 }
cafe563591446c Kent Overstreet 2013-03-23 593
c2a4f3183a1248 Kent Overstreet 2013-09-23 594 /*
c2a4f3183a1248 Kent Overstreet 2013-09-23 595 * Wait for outstanding writeback IOs to finish (and keybuf slots to be
c2a4f3183a1248 Kent Overstreet 2013-09-23 596 * freed) before refilling again
c2a4f3183a1248 Kent Overstreet 2013-09-23 597 */
5e6926daac267d Kent Overstreet 2013-07-24 598 closure_sync(&cl);
448f60e858d871 Coly Li 2025-10-07 599
448f60e858d871 Coly Li 2025-10-07 600 if (atomic_read(&dc->writeback_keys.handled) == dump_nr) {
448f60e858d871 Coly Li 2025-10-07 601 spin_lock(&dc->writeback_keys.lock);
448f60e858d871 Coly Li 2025-10-07 602 dc->writeback_keys.keys = RB_ROOT;
448f60e858d871 Coly Li 2025-10-07 603 array_allocator_init(&dc->writeback_keys.freelist);
448f60e858d871 Coly Li 2025-10-07 604 spin_unlock(&dc->writeback_keys.lock);
448f60e858d871 Coly Li 2025-10-07 605 } else {
448f60e858d871 Coly Li 2025-10-07 606 for (i = 0; i < dump_nr; i++) {
448f60e858d871 Coly Li 2025-10-07 607 w = dump_keys[i];
448f60e858d871 Coly Li 2025-10-07 608 if (!w->private)
448f60e858d871 Coly Li 2025-10-07 609 continue;
448f60e858d871 Coly Li 2025-10-07 610 bch_keybuf_del(&dc->writeback_keys, w);
448f60e858d871 Coly Li 2025-10-07 611 }
448f60e858d871 Coly Li 2025-10-07 612 }
448f60e858d871 Coly Li 2025-10-07 613 atomic_set(&dc->writeback_keys.handled, 0);
5e6926daac267d Kent Overstreet 2013-07-24 614 }
5e6926daac267d Kent Overstreet 2013-07-24 615
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH] bcache: avoid redundant access RB tree in read_dirty
2025-10-07 9:02 [PATCH] bcache: avoid redundant access RB tree in read_dirty colyli
2025-10-10 16:13 ` kernel test robot
2025-10-10 16:56 ` kernel test robot
@ 2025-10-20 16:44 ` Pierre Juhen
[not found] ` <050fe436-e629-4428-8e4d-33edd8985767@orange.fr>
3 siblings, 0 replies; 7+ messages in thread
From: Pierre Juhen @ 2025-10-20 16:44 UTC (permalink / raw)
To: colyli, linux-bcache; +Cc: Zhou Jifeng
Hi
I am on kernel 6.16.12.
I have had errors with bcache recently, And I lost my fronted 3 or 4 times :
oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 128:
bad csum, 32768 bytes, offset 0
oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 64:
bad csum, 22928 bytes, offset 0
oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 32:
bad csum, 4848 bytes, offset 2
oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 48:
bad csum, 14096 bytes, offset 0
oct. 20 15:37:40 pierre.juhen (udev-worker)[461]: nvme0n1p3: Process
'bcache-register /dev/nvme0n1p3' failed with exit code 1.
oct. 20 15:37:40 pierre.juhen kernel: bcache: prio_read() bad csum
reading priorities
oct. 20 15:37:40 pierre.juhen kernel: bcache: bch_cache_set_error()
error on 448f191c-28df-4396-bc44-14d1f77c9005: IO error reading
priorities, disabling caching
oct. 20 15:37:40 pierre.juhen kernel: bcache: register_bcache() error :
failed to register device
I had to reconfigure everything after a disk problem.
I have been running bcache for years now, without any problems.
The only difference might be that I configured the frontend with the
discard option.
The logical volume using bcache have also a discard option in fstab.
The frontend is on a Samsung 980 nvme disk.
Any hint ?
Thank you
Regards
^ permalink raw reply [flat|nested] 7+ messages in thread[parent not found: <050fe436-e629-4428-8e4d-33edd8985767@orange.fr>]
* Re: [PATCH] bcache: avoid redundant access RB tree in read_dirty
[not found] ` <050fe436-e629-4428-8e4d-33edd8985767@orange.fr>
@ 2025-10-21 1:18 ` Coly Li
2025-10-21 2:58 ` Discard option Pierre Juhen
0 siblings, 1 reply; 7+ messages in thread
From: Coly Li @ 2025-10-21 1:18 UTC (permalink / raw)
To: Pierre Juhen; +Cc: linux-bcache
> 2025年10月21日 00:39,Pierre Juhen <pierre.juhen@orange.fr> 写道:
>
> Hi
> I am on kernel 6.16.12.
> I have had errors with bcache recently, And I lost my fronted 3 or 4 times :
> oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 128: bad csum, 32768 bytes, offset 0
> oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 64: bad csum, 22928 bytes, offset 0
> oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 32: bad csum, 4848 bytes, offset 2
> oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 48: bad csum, 14096 bytes, offset 0
> oct. 20 15:37:40 pierre.juhen (udev-worker)[461]: nvme0n1p3: Process 'bcache-register /dev/nvme0n1p3' failed with exit code 1.
> oct. 20 15:37:40 pierre.juhen kernel: bcache: prio_read() bad csum reading priorities
> oct. 20 15:37:40 pierre.juhen kernel: bcache: bch_cache_set_error() error on 448f191c-28df-4396-bc44-14d1f77c9005: IO error reading priorities, disabling caching
> oct. 20 15:37:40 pierre.juhen kernel: bcache: register_bcache() error : failed to register device
>
I assume this email is irrelevant to the patch “bcache: avoid redundant access RB tree in read_dirty”, am I correct?
> I had to reconfigure everything after a disk problem.
> I have been running bcache for years now, without any problems.
> The only difference might be that I configured the frontend with the discard option.
The discard option is not recommended. Indeed in next merge window I will submit a patch series to drop the discard option.
> The logical volume using bcache have also a discard option in fstab.
> The frontend is on a Samsung 980 nvme disk.
Try not to enable discard on cache device. This option will disappear soon.
I don’t know whether discard option of Samsung 980 nvme disk may change the content of discarded LBA or not, from NVMe spec, it could be zero-filled or undefined.
Anyway in current code discard doesn’t help performance, I suggest to not enable discard and see whether the issue still shows up.
My suggestion is: always use default configuration, all our test case and performance optimization are for default configurations.
Thanks.
Coly Li
^ permalink raw reply [flat|nested] 7+ messages in thread* Discard option
2025-10-21 1:18 ` Coly Li
@ 2025-10-21 2:58 ` Pierre Juhen
2025-10-21 4:10 ` Coly Li
0 siblings, 1 reply; 7+ messages in thread
From: Pierre Juhen @ 2025-10-21 2:58 UTC (permalink / raw)
To: Coly Li; +Cc: linux-bcache
Hi Coli,
I assume this email is irrelevant to the patch “bcache: avoid redundant access RB tree in read_dirty”, am I correct?
Yes you are; sorry, I picked a message and forgot to change the title
The discard option is not recommended. Indeed in next merge window I will submit a patch series to drop the discard option.
OK, I will boot on a Live USB Key to wipe the caching device and create
a new one without discard option.
But should I keep the discard mount option in fstab for the logical
volumes inside bcache ?
Thanks,
Regards
_______________
Le 21/10/2025 à 03:18, Coly Li a écrit :
>> 2025年10月21日 00:39,Pierre Juhen <pierre.juhen@orange.fr> 写道:
>>
>> Hi
>> I am on kernel 6.16.12.
>> I have had errors with bcache recently, And I lost my fronted 3 or 4 times :
>> oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 128: bad csum, 32768 bytes, offset 0
>> oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 64: bad csum, 22928 bytes, offset 0
>> oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 32: bad csum, 4848 bytes, offset 2
>> oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 48: bad csum, 14096 bytes, offset 0
>> oct. 20 15:37:40 pierre.juhen (udev-worker)[461]: nvme0n1p3: Process 'bcache-register /dev/nvme0n1p3' failed with exit code 1.
>> oct. 20 15:37:40 pierre.juhen kernel: bcache: prio_read() bad csum reading priorities
>> oct. 20 15:37:40 pierre.juhen kernel: bcache: bch_cache_set_error() error on 448f191c-28df-4396-bc44-14d1f77c9005: IO error reading priorities, disabling caching
>> oct. 20 15:37:40 pierre.juhen kernel: bcache: register_bcache() error : failed to register device
>>
> I assume this email is irrelevant to the patch “bcache: avoid redundant access RB tree in read_dirty”, am I correct?
>
>
>> I had to reconfigure everything after a disk problem.
>> I have been running bcache for years now, without any problems.
>> The only difference might be that I configured the frontend with the discard option.
> The discard option is not recommended. Indeed in next merge window I will submit a patch series to drop the discard option.
>
>
>> The logical volume using bcache have also a discard option in fstab.
>> The frontend is on a Samsung 980 nvme disk.
> Try not to enable discard on cache device. This option will disappear soon.
>
> I don’t know whether discard option of Samsung 980 nvme disk may change the content of discarded LBA or not, from NVMe spec, it could be zero-filled or undefined.
> Anyway in current code discard doesn’t help performance, I suggest to not enable discard and see whether the issue still shows up.
>
> My suggestion is: always use default configuration, all our test case and performance optimization are for default configurations.
>
> Thanks.
>
> Coly Li
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Discard option
2025-10-21 2:58 ` Discard option Pierre Juhen
@ 2025-10-21 4:10 ` Coly Li
0 siblings, 0 replies; 7+ messages in thread
From: Coly Li @ 2025-10-21 4:10 UTC (permalink / raw)
To: Pierre Juhen; +Cc: linux-bcache
On Tue, Oct 21, 2025 at 04:58:21AM +0800, Pierre Juhen wrote:
> Hi Coli,
>
> I assume this email is irrelevant to the patch “bcache: avoid redundant access RB tree in read_dirty”, am I correct?
>
> Yes you are; sorry, I picked a message and forgot to change the title
>
> The discard option is not recommended. Indeed in next merge window I will submit a patch series to drop the discard option.
>
> OK, I will boot on a Live USB Key to wipe the caching device and create a
> new one without discard option.
>
> But should I keep the discard mount option in fstab for the logical volumes
> inside bcache ?
>
Just not enable the bcache internal discard is fine. Bcache handles outside
discard bios properly, if this is a file system mounted on top of a bcache
device, it is fine to keep discard option of the file system.
>
> _______________
>
> Le 21/10/2025 à 03:18, Coly Li a écrit :
> > > 2025年10月21日 00:39,Pierre Juhen <pierre.juhen@orange.fr> 写道:
> > >
> > > Hi
> > > I am on kernel 6.16.12.
> > > I have had errors with bcache recently, And I lost my fronted 3 or 4 times :
> > > oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 128: bad csum, 32768 bytes, offset 0
> > > oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 64: bad csum, 22928 bytes, offset 0
> > > oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 32: bad csum, 4848 bytes, offset 2
> > > oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 48: bad csum, 14096 bytes, offset 0
> > > oct. 20 15:37:40 pierre.juhen (udev-worker)[461]: nvme0n1p3: Process 'bcache-register /dev/nvme0n1p3' failed with exit code 1.
> > > oct. 20 15:37:40 pierre.juhen kernel: bcache: prio_read() bad csum reading priorities
> > > oct. 20 15:37:40 pierre.juhen kernel: bcache: bch_cache_set_error() error on 448f191c-28df-4396-bc44-14d1f77c9005: IO error reading priorities, disabling caching
> > > oct. 20 15:37:40 pierre.juhen kernel: bcache: register_bcache() error : failed to register device
> > >
> > I assume this email is irrelevant to the patch “bcache: avoid redundant access RB tree in read_dirty”, am I correct?
> >
> >
> > > I had to reconfigure everything after a disk problem.
> > > I have been running bcache for years now, without any problems.
> > > The only difference might be that I configured the frontend with the discard option.
> > The discard option is not recommended. Indeed in next merge window I will submit a patch series to drop the discard option.
> >
> >
> > > The logical volume using bcache have also a discard option in fstab.
> > > The frontend is on a Samsung 980 nvme disk.
> > Try not to enable discard on cache device. This option will disappear soon.
> >
> > I don’t know whether discard option of Samsung 980 nvme disk may change the content of discarded LBA or not, from NVMe spec, it could be zero-filled or undefined.
> > Anyway in current code discard doesn’t help performance, I suggest to not enable discard and see whether the issue still shows up.
> >
> > My suggestion is: always use default configuration, all our test case and performance optimization are for default configurations.
> >
> > Thanks.
> >
> > Coly Li
> >
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2025-10-21 4:11 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-07 9:02 [PATCH] bcache: avoid redundant access RB tree in read_dirty colyli
2025-10-10 16:13 ` kernel test robot
2025-10-10 16:56 ` kernel test robot
2025-10-20 16:44 ` Pierre Juhen
[not found] ` <050fe436-e629-4428-8e4d-33edd8985767@orange.fr>
2025-10-21 1:18 ` Coly Li
2025-10-21 2:58 ` Discard option Pierre Juhen
2025-10-21 4:10 ` Coly Li
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox