Linux bcache driver list
 help / color / mirror / Atom feed
* [PATCH] bcache: avoid redundant access RB tree in read_dirty
@ 2025-10-07  9:02 colyli
  2025-10-10 16:13 ` kernel test robot
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: colyli @ 2025-10-07  9:02 UTC (permalink / raw)
  To: linux-bcache; +Cc: Coly Li, Zhou Jifeng

From: Coly Li <colyli@fnnas.com>

In bcache writeback procedure, scanned dirty keys are stored in a
red-black tree, it is dc->writeback_keys.freelist, and can be indexed by
dc->writeback_keys.keys. Inside red_dirty() one dirty key is fetched by
calling bch_keybuf_next() in each while-loop until one of the following
situations meets,
- The fetched keys number reaches MAX_WRTEBACKS_IN_PASS.
- The total size of the fetched keys reaches MAX_WRITESIZE_IN_PASS.
- All nodes in the red-black tree are iterated.

The above process is reasonable, but calling bch_keybuf_next() in each
while-loop is inefficient. Let me explain why. Let's see its code,
2767 struct keybuf_key *bch_keybuf_next(struct keybuf *buf)
2768 {
2769         struct keybuf_key *w;
2770
2771         spin_lock(&buf->lock);
2772
2773         w = RB_FIRST(&buf->keys, struct keybuf_key, node);
2774
2775         while (w && w->private)
2776                 w = RB_NEXT(w, node);
2777
2778         if (w)
2779                 w->private = ERR_PTR(-EINTR);
2780
2781         spin_unlock(&buf->lock);
2782         return w;
2783 }

Every time when bch_keybuf_next() is called, the red-black tree is
iterated from start until a node doesn't have w->private value. Then
this node's w->private is set by ERR_PTR(-EINTR) and pointer of the
node's holder (struct keybuf_key) is returned back to read_dirty().

In worst case, if there are 500 nodes in this red-black tree, nodes to
be iterated in each calling of bch_keybuf_next() in read_dirty() will
be: 1, 2, 3, 4, ......, 498, 499, 500. The total nodes iteration times
are 125250. If all the nodes can be fetched from read_dirty() once, the
iteration times are only 500, it is about 0.3% of the original number.

This patch adds new member dump_keys[KEYBUF_NR], it is used to store
one-time fetched dirty keys from the red-black trees. All the keys in
dump_keys[] array is in their original iterated order. A new function
bch_keybuf_dump() is used to fetch all ndoes from the red-black tree and
store the selected keys in dump_keys[]. In side read_dirty(), this new
function is called before entering while-loop, and bch_keybuf_next() is
not used anymore. Now the red-black tree only be iterated once, also the
spinlock buf->lock is only acquired once.

For the situation that not all nodes inside dump_keys[] are handled, the
non-handled nodes should be fetched next time when read_dirty() is
called again. So inside the red-black tree, w->private is not set by
ERR_PTR(-EINTR) when the node is fetched into dump_keys[]. It is set in
write_dirty_finish(), that means this node is marked as handled when the
dirty data of the key is written backed to backing device. This is a
change should be noticed.

Inside read_dirty() after closure_sync() returned, it means all write-
back bios are completed, then patch will check whether all fetched nodes
are written back. Because items in dump_keys[] array are exactly equal
to nodes in the red-black tree, if all the keys in dump_keys[] are
handled, the red-black tree is reset directly by array_allocator_init().
Only when unexpected condition happens and there are some items inside
dump_keys[] array not handled, bch_keybuf_del() is called to delete the
written dirty keys one by one from the red-black tree.

In practice, testing shows most of the time all items in dump_keys[] are
all handled promptly, the red-black tree is directly reset.

Answers to some potential concerns:
- Memory barrier issue for dc->writeback_keys.hangled.
    This atomic_t value is increased inside semaphore dc->in_flight in
  write_dirty_finish(). Calling up(&dc->in_flight) has implicit memory
  barrier inside, no need to redundant explicit memory barrier call. 
- Concurrent access to the red-black tree in dc->writeback_keys.freelist
    Access to dc->writeback_keys.freelist is guarded by
  dc->writeback_keys.lock, it is safe for concurrent access on it.
  Indeed before returning from read_dirty(), no one else will access
  this red-black tree (no matter adding new nodes or setting w->private)
  it is also safe to check w->private inside read_dirty() without
  holding dc->writeback_keys.lock after closure_sync() returned. 

The idea was inspired by code review comments of previous patches from
Zhou Jifeng.


Signed-off-by: Coly Li <colyli@fnnas.com> 
Cc: Zhou Jifeng <zhoujifeng@kylinos.com.cn>
---
 drivers/md/bcache/bcache.h    |  2 ++
 drivers/md/bcache/btree.c     | 28 ++++++++++++++++++++++
 drivers/md/bcache/btree.h     |  2 ++
 drivers/md/bcache/writeback.c | 45 +++++++++++++++++++++++++----------
 4 files changed, 65 insertions(+), 12 deletions(-)

diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index 8ccacba85547..06880e2d5b1d 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -240,11 +240,13 @@ struct keybuf {
 	 */
 	struct bkey		start;
 	struct bkey		end;
+	atomic_t		handled;
 
 	struct rb_root		keys;
 
 #define KEYBUF_NR		500
 	DECLARE_ARRAY_ALLOCATOR(struct keybuf_key, freelist, KEYBUF_NR);
+	struct keybuf_key	*dump_keys[KEYBUF_NR];
 };
 
 struct bcache_device {
diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index bdb90833bff0..e107d0f05ee4 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -2763,6 +2763,34 @@ bool bch_keybuf_check_overlapping(struct keybuf *buf, struct bkey *start,
 	return ret;
 }
 
+int bch_keybuf_dump(struct keybuf *buf, struct keybuf_key *dump_list[],
+		    int dump_list_len)
+{
+	struct keybuf_key *w;
+	int i = 0;
+
+	memset(dump_list, 0, dump_list_len * sizeof(struct keybuf_key *));
+
+	spin_lock(&buf->lock);
+
+	w = RB_FIRST(&buf->keys, struct keybuf_key, node);
+
+	while (w && i < dump_list_len) {
+		if (w->private) {
+			w = RB_NEXT(w, node);
+			continue;
+		}
+
+		dump_list[i++] = w;
+		w->private = ERR_PTR(-EINTR);
+		w = RB_NEXT(w, node);
+	}
+
+	spin_unlock(&buf->lock);
+
+	return i;
+}
+
 struct keybuf_key *bch_keybuf_next(struct keybuf *buf)
 {
 	struct keybuf_key *w;
diff --git a/drivers/md/bcache/btree.h b/drivers/md/bcache/btree.h
index 45d64b54115a..65d81a7b5ac6 100644
--- a/drivers/md/bcache/btree.h
+++ b/drivers/md/bcache/btree.h
@@ -413,5 +413,7 @@ struct keybuf_key *bch_keybuf_next_rescan(struct cache_set *c,
 					  struct keybuf *buf,
 					  struct bkey *end,
 					  keybuf_pred_fn *pred);
+int bch_keybuf_dump(struct keybuf *buf, struct keybuf_key *dump_list[],
+		   int dump_list_len);
 void bch_update_bucket_in_use(struct cache_set *c, struct gc_stat *stats);
 #endif
diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
index 59cd0c3f8ce9..e6c548e83ff1 100644
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -381,7 +381,8 @@ static CLOSURE_CALLBACK(write_dirty_finish)
 				: &dc->disk.c->writeback_keys_done);
 	}
 
-	bch_keybuf_del(&dc->writeback_keys, w);
+	w->private = ERR_PTR(-EINTR);
+	atomic_inc(&dc->writeback_keys.handled);
 	up(&dc->in_flight);
 
 	closure_return_with_destructor(cl, dirty_io_destructor);
@@ -474,8 +475,10 @@ static CLOSURE_CALLBACK(read_dirty_submit)
 static void read_dirty(struct cached_dev *dc)
 {
 	unsigned int delay = 0;
-	struct keybuf_key *next, *keys[MAX_WRITEBACKS_IN_PASS], *w;
+	struct keybuf_key *keys[MAX_WRITEBACKS_IN_PASS], *w;
+	struct keybuf_key **dump_keys;
 	size_t size;
+	int checked, dump_nr;
 	int nk, i;
 	struct dirty_io *io;
 	struct closure cl;
@@ -489,17 +492,22 @@ static void read_dirty(struct cached_dev *dc)
 	 * XXX: if we error, background writeback just spins. Should use some
 	 * mempools.
 	 */
-
-	next = bch_keybuf_next(&dc->writeback_keys);
+	dump_nr = bch_keybuf_dump(&dc->writeback_keys,
+			dc->writeback_keys.dump_keys,
+			ARRAY_SIZE(dc->writeback_keys.dump_keys));
+	dump_keys = dc->writeback_keys.dump_keys;
+	atomic_set(&dc->writeback_keys.handled, 0);
+	checked = 0;
 
 	while (!kthread_should_stop() &&
 	       !test_bit(CACHE_SET_IO_DISABLE, &dc->disk.c->flags) &&
-	       next) {
+	       (checked < dump_nr)) {
 		size = 0;
 		nk = 0;
 
 		do {
-			BUG_ON(ptr_stale(dc->disk.c, &next->key, 0));
+			w = dump_keys[checked];
+			BUG_ON(ptr_stale(dc->disk.c, &w->key, 0));
 
 			/*
 			 * Don't combine too many operations, even if they
@@ -525,12 +533,12 @@ static void read_dirty(struct cached_dev *dc)
 			 * command queueing.
 			 */
 			if ((nk != 0) && bkey_cmp(&keys[nk-1]->key,
-						&START_KEY(&next->key)))
+						&START_KEY(&w->key)))
 				break;
 
-			size += KEY_SIZE(&next->key);
-			keys[nk++] = next;
-		} while ((next = bch_keybuf_next(&dc->writeback_keys)));
+			size += KEY_SIZE(&w->key);
+			keys[nk++] = w;
+		} while (++checked < dump_nr);
 
 		/* Now we have gathered a set of 1..5 keys to write back. */
 		for (i = 0; i < nk; i++) {
@@ -581,7 +589,6 @@ static void read_dirty(struct cached_dev *dc)
 err_free:
 		kfree(w->private);
 err:
-		bch_keybuf_del(&dc->writeback_keys, w);
 	}
 
 	/*
@@ -589,6 +596,21 @@ static void read_dirty(struct cached_dev *dc)
 	 * freed) before refilling again
 	 */
 	closure_sync(&cl);
+
+	if (atomic_read(&dc->writeback_keys.handled) == dump_nr) {
+		spin_lock(&dc->writeback_keys.lock);
+		dc->writeback_keys.keys = RB_ROOT;
+		array_allocator_init(&dc->writeback_keys.freelist);
+		spin_unlock(&dc->writeback_keys.lock);
+	} else {
+		for (i = 0; i < dump_nr; i++) {
+			w = dump_keys[i];
+			if (!w->private)
+				continue;
+			bch_keybuf_del(&dc->writeback_keys, w);
+		}
+	}
+	atomic_set(&dc->writeback_keys.handled, 0);
 }
 
 /* Scan for dirty data */
@@ -820,7 +842,6 @@ static int bch_writeback_thread(void *arg)
 
 		if (searched_full_index) {
 			unsigned int delay = dc->writeback_delay * HZ;
-
 			while (delay &&
 			       !kthread_should_stop() &&
 			       !test_bit(CACHE_SET_IO_DISABLE, &c->flags) &&
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] bcache: avoid redundant access RB tree in read_dirty
  2025-10-07  9:02 [PATCH] bcache: avoid redundant access RB tree in read_dirty colyli
@ 2025-10-10 16:13 ` kernel test robot
  2025-10-10 16:56 ` kernel test robot
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: kernel test robot @ 2025-10-10 16:13 UTC (permalink / raw)
  To: colyli, linux-bcache; +Cc: llvm, oe-kbuild-all, Coly Li, Zhou Jifeng

Hi,

kernel test robot noticed the following build warnings:

[auto build test WARNING on linus/master]
[also build test WARNING on v6.17 next-20251010]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/colyli-fnnas-com/bcache-avoid-redundant-access-RB-tree-in-read_dirty/20251010-103843
base:   linus/master
patch link:    https://lore.kernel.org/r/20251007090232.30386-1-colyli%40fnnas.com
patch subject: [PATCH] bcache: avoid redundant access RB tree in read_dirty
config: x86_64-buildonly-randconfig-001-20251010 (https://download.01.org/0day-ci/archive/20251011/202510110048.PdCCuB3b-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251011/202510110048.PdCCuB3b-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510110048.PdCCuB3b-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/md/bcache/writeback.c:592:2: warning: label at end of compound statement is a C23 extension [-Wc23-extensions]
     592 |         }
         |         ^
   1 warning generated.


vim +592 drivers/md/bcache/writeback.c

cafe563591446c Kent Overstreet   2013-03-23  474  
5e6926daac267d Kent Overstreet   2013-07-24  475  static void read_dirty(struct cached_dev *dc)
cafe563591446c Kent Overstreet   2013-03-23  476  {
6f10f7d1b02b1b Coly Li           2018-08-11  477  	unsigned int delay = 0;
448f60e858d871 Coly Li           2025-10-07  478  	struct keybuf_key *keys[MAX_WRITEBACKS_IN_PASS], *w;
448f60e858d871 Coly Li           2025-10-07  479  	struct keybuf_key **dump_keys;
539d39eb270834 Tang Junhui       2018-01-08  480  	size_t size;
448f60e858d871 Coly Li           2025-10-07  481  	int checked, dump_nr;
539d39eb270834 Tang Junhui       2018-01-08  482  	int nk, i;
cafe563591446c Kent Overstreet   2013-03-23  483  	struct dirty_io *io;
5e6926daac267d Kent Overstreet   2013-07-24  484  	struct closure cl;
6e6ccc67b9c7a6 Michael Lyle      2018-01-08  485  	uint16_t sequence = 0;
5e6926daac267d Kent Overstreet   2013-07-24  486  
6e6ccc67b9c7a6 Michael Lyle      2018-01-08  487  	BUG_ON(!llist_empty(&dc->writeback_ordering_wait.list));
6e6ccc67b9c7a6 Michael Lyle      2018-01-08  488  	atomic_set(&dc->writeback_sequence_next, sequence);
5e6926daac267d Kent Overstreet   2013-07-24  489  	closure_init_stack(&cl);
cafe563591446c Kent Overstreet   2013-03-23  490  
cafe563591446c Kent Overstreet   2013-03-23  491  	/*
cafe563591446c Kent Overstreet   2013-03-23  492  	 * XXX: if we error, background writeback just spins. Should use some
cafe563591446c Kent Overstreet   2013-03-23  493  	 * mempools.
cafe563591446c Kent Overstreet   2013-03-23  494  	 */
448f60e858d871 Coly Li           2025-10-07  495  	dump_nr = bch_keybuf_dump(&dc->writeback_keys,
448f60e858d871 Coly Li           2025-10-07  496  			dc->writeback_keys.dump_keys,
448f60e858d871 Coly Li           2025-10-07  497  			ARRAY_SIZE(dc->writeback_keys.dump_keys));
448f60e858d871 Coly Li           2025-10-07  498  	dump_keys = dc->writeback_keys.dump_keys;
448f60e858d871 Coly Li           2025-10-07  499  	atomic_set(&dc->writeback_keys.handled, 0);
448f60e858d871 Coly Li           2025-10-07  500  	checked = 0;
5e6926daac267d Kent Overstreet   2013-07-24  501  
771f393e8ffc9b Coly Li           2018-03-18  502  	while (!kthread_should_stop() &&
771f393e8ffc9b Coly Li           2018-03-18  503  	       !test_bit(CACHE_SET_IO_DISABLE, &dc->disk.c->flags) &&
448f60e858d871 Coly Li           2025-10-07  504  	       (checked < dump_nr)) {
539d39eb270834 Tang Junhui       2018-01-08  505  		size = 0;
539d39eb270834 Tang Junhui       2018-01-08  506  		nk = 0;
539d39eb270834 Tang Junhui       2018-01-08  507  
539d39eb270834 Tang Junhui       2018-01-08  508  		do {
448f60e858d871 Coly Li           2025-10-07  509  			w = dump_keys[checked];
448f60e858d871 Coly Li           2025-10-07  510  			BUG_ON(ptr_stale(dc->disk.c, &w->key, 0));
539d39eb270834 Tang Junhui       2018-01-08  511  
539d39eb270834 Tang Junhui       2018-01-08  512  			/*
539d39eb270834 Tang Junhui       2018-01-08  513  			 * Don't combine too many operations, even if they
539d39eb270834 Tang Junhui       2018-01-08  514  			 * are all small.
539d39eb270834 Tang Junhui       2018-01-08  515  			 */
539d39eb270834 Tang Junhui       2018-01-08  516  			if (nk >= MAX_WRITEBACKS_IN_PASS)
cafe563591446c Kent Overstreet   2013-03-23  517  				break;
cafe563591446c Kent Overstreet   2013-03-23  518  
539d39eb270834 Tang Junhui       2018-01-08  519  			/*
539d39eb270834 Tang Junhui       2018-01-08  520  			 * If the current operation is very large, don't
539d39eb270834 Tang Junhui       2018-01-08  521  			 * further combine operations.
539d39eb270834 Tang Junhui       2018-01-08  522  			 */
539d39eb270834 Tang Junhui       2018-01-08  523  			if (size >= MAX_WRITESIZE_IN_PASS)
539d39eb270834 Tang Junhui       2018-01-08  524  				break;
cafe563591446c Kent Overstreet   2013-03-23  525  
539d39eb270834 Tang Junhui       2018-01-08  526  			/*
539d39eb270834 Tang Junhui       2018-01-08  527  			 * Operations are only eligible to be combined
539d39eb270834 Tang Junhui       2018-01-08  528  			 * if they are contiguous.
539d39eb270834 Tang Junhui       2018-01-08  529  			 *
539d39eb270834 Tang Junhui       2018-01-08  530  			 * TODO: add a heuristic willing to fire a
539d39eb270834 Tang Junhui       2018-01-08  531  			 * certain amount of non-contiguous IO per pass,
539d39eb270834 Tang Junhui       2018-01-08  532  			 * so that we can benefit from backing device
539d39eb270834 Tang Junhui       2018-01-08  533  			 * command queueing.
539d39eb270834 Tang Junhui       2018-01-08  534  			 */
539d39eb270834 Tang Junhui       2018-01-08  535  			if ((nk != 0) && bkey_cmp(&keys[nk-1]->key,
448f60e858d871 Coly Li           2025-10-07  536  						&START_KEY(&w->key)))
539d39eb270834 Tang Junhui       2018-01-08  537  				break;
cafe563591446c Kent Overstreet   2013-03-23  538  
448f60e858d871 Coly Li           2025-10-07  539  			size += KEY_SIZE(&w->key);
448f60e858d871 Coly Li           2025-10-07  540  			keys[nk++] = w;
448f60e858d871 Coly Li           2025-10-07  541  		} while (++checked < dump_nr);
cafe563591446c Kent Overstreet   2013-03-23  542  
539d39eb270834 Tang Junhui       2018-01-08  543  		/* Now we have gathered a set of 1..5 keys to write back. */
539d39eb270834 Tang Junhui       2018-01-08  544  		for (i = 0; i < nk; i++) {
539d39eb270834 Tang Junhui       2018-01-08  545  			w = keys[i];
539d39eb270834 Tang Junhui       2018-01-08  546  
d86eaa0f3c56da Christoph Hellwig 2025-09-08  547  			io = kzalloc(sizeof(*io) + sizeof(struct bio_vec) *
d86eaa0f3c56da Christoph Hellwig 2025-09-08  548  				DIV_ROUND_UP(KEY_SIZE(&w->key), PAGE_SECTORS),
cafe563591446c Kent Overstreet   2013-03-23  549  				GFP_KERNEL);
cafe563591446c Kent Overstreet   2013-03-23  550  			if (!io)
cafe563591446c Kent Overstreet   2013-03-23  551  				goto err;
cafe563591446c Kent Overstreet   2013-03-23  552  
cafe563591446c Kent Overstreet   2013-03-23  553  			w->private	= io;
cafe563591446c Kent Overstreet   2013-03-23  554  			io->dc		= dc;
6e6ccc67b9c7a6 Michael Lyle      2018-01-08  555  			io->sequence    = sequence++;
cafe563591446c Kent Overstreet   2013-03-23  556  
cafe563591446c Kent Overstreet   2013-03-23  557  			dirty_init(w);
c34b7ac6508755 Christoph Hellwig 2022-12-06  558  			io->bio.bi_opf = REQ_OP_READ;
4f024f3797c43c Kent Overstreet   2013-10-11  559  			io->bio.bi_iter.bi_sector = PTR_OFFSET(&w->key, 0);
11e9560e6c005b Christoph Hellwig 2021-04-11  560  			bio_set_dev(&io->bio, dc->disk.c->cache->bdev);
cafe563591446c Kent Overstreet   2013-03-23  561  			io->bio.bi_end_io	= read_dirty_endio;
cafe563591446c Kent Overstreet   2013-03-23  562  
25d8be77e19224 Ming Lei          2017-12-18  563  			if (bch_bio_alloc_pages(&io->bio, GFP_KERNEL))
cafe563591446c Kent Overstreet   2013-03-23  564  				goto err_free;
cafe563591446c Kent Overstreet   2013-03-23  565  
c37511b863f36c Kent Overstreet   2013-04-26  566  			trace_bcache_writeback(&w->key);
cafe563591446c Kent Overstreet   2013-03-23  567  
c2a4f3183a1248 Kent Overstreet   2013-09-23  568  			down(&dc->in_flight);
539d39eb270834 Tang Junhui       2018-01-08  569  
3be11dbab67a3e Coly Li           2018-08-11  570  			/*
3be11dbab67a3e Coly Li           2018-08-11  571  			 * We've acquired a semaphore for the maximum
539d39eb270834 Tang Junhui       2018-01-08  572  			 * simultaneous number of writebacks; from here
539d39eb270834 Tang Junhui       2018-01-08  573  			 * everything happens asynchronously.
539d39eb270834 Tang Junhui       2018-01-08  574  			 */
5e6926daac267d Kent Overstreet   2013-07-24  575  			closure_call(&io->cl, read_dirty_submit, NULL, &cl);
539d39eb270834 Tang Junhui       2018-01-08  576  		}
cafe563591446c Kent Overstreet   2013-03-23  577  
539d39eb270834 Tang Junhui       2018-01-08  578  		delay = writeback_delay(dc, size);
539d39eb270834 Tang Junhui       2018-01-08  579  
771f393e8ffc9b Coly Li           2018-03-18  580  		while (!kthread_should_stop() &&
771f393e8ffc9b Coly Li           2018-03-18  581  		       !test_bit(CACHE_SET_IO_DISABLE, &dc->disk.c->flags) &&
771f393e8ffc9b Coly Li           2018-03-18  582  		       delay) {
539d39eb270834 Tang Junhui       2018-01-08  583  			schedule_timeout_interruptible(delay);
539d39eb270834 Tang Junhui       2018-01-08  584  			delay = writeback_delay(dc, 0);
539d39eb270834 Tang Junhui       2018-01-08  585  		}
cafe563591446c Kent Overstreet   2013-03-23  586  	}
cafe563591446c Kent Overstreet   2013-03-23  587  
cafe563591446c Kent Overstreet   2013-03-23  588  	if (0) {
cafe563591446c Kent Overstreet   2013-03-23  589  err_free:
cafe563591446c Kent Overstreet   2013-03-23  590  		kfree(w->private);
cafe563591446c Kent Overstreet   2013-03-23  591  err:
cafe563591446c Kent Overstreet   2013-03-23 @592  	}
cafe563591446c Kent Overstreet   2013-03-23  593  
c2a4f3183a1248 Kent Overstreet   2013-09-23  594  	/*
c2a4f3183a1248 Kent Overstreet   2013-09-23  595  	 * Wait for outstanding writeback IOs to finish (and keybuf slots to be
c2a4f3183a1248 Kent Overstreet   2013-09-23  596  	 * freed) before refilling again
c2a4f3183a1248 Kent Overstreet   2013-09-23  597  	 */
5e6926daac267d Kent Overstreet   2013-07-24  598  	closure_sync(&cl);
448f60e858d871 Coly Li           2025-10-07  599  
448f60e858d871 Coly Li           2025-10-07  600  	if (atomic_read(&dc->writeback_keys.handled) == dump_nr) {
448f60e858d871 Coly Li           2025-10-07  601  		spin_lock(&dc->writeback_keys.lock);
448f60e858d871 Coly Li           2025-10-07  602  		dc->writeback_keys.keys = RB_ROOT;
448f60e858d871 Coly Li           2025-10-07  603  		array_allocator_init(&dc->writeback_keys.freelist);
448f60e858d871 Coly Li           2025-10-07  604  		spin_unlock(&dc->writeback_keys.lock);
448f60e858d871 Coly Li           2025-10-07  605  	} else {
448f60e858d871 Coly Li           2025-10-07  606  		for (i = 0; i < dump_nr; i++) {
448f60e858d871 Coly Li           2025-10-07  607  			w = dump_keys[i];
448f60e858d871 Coly Li           2025-10-07  608  			if (!w->private)
448f60e858d871 Coly Li           2025-10-07  609  				continue;
448f60e858d871 Coly Li           2025-10-07  610  			bch_keybuf_del(&dc->writeback_keys, w);
448f60e858d871 Coly Li           2025-10-07  611  		}
448f60e858d871 Coly Li           2025-10-07  612  	}
448f60e858d871 Coly Li           2025-10-07  613  	atomic_set(&dc->writeback_keys.handled, 0);
5e6926daac267d Kent Overstreet   2013-07-24  614  }
5e6926daac267d Kent Overstreet   2013-07-24  615  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] bcache: avoid redundant access RB tree in read_dirty
  2025-10-07  9:02 [PATCH] bcache: avoid redundant access RB tree in read_dirty colyli
  2025-10-10 16:13 ` kernel test robot
@ 2025-10-10 16:56 ` kernel test robot
  2025-10-20 16:44 ` Pierre Juhen
       [not found] ` <050fe436-e629-4428-8e4d-33edd8985767@orange.fr>
  3 siblings, 0 replies; 7+ messages in thread
From: kernel test robot @ 2025-10-10 16:56 UTC (permalink / raw)
  To: colyli, linux-bcache; +Cc: oe-kbuild-all, Coly Li, Zhou Jifeng

Hi,

kernel test robot noticed the following build errors:

[auto build test ERROR on linus/master]
[also build test ERROR on v6.17 next-20251010]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/colyli-fnnas-com/bcache-avoid-redundant-access-RB-tree-in-read_dirty/20251010-103843
base:   linus/master
patch link:    https://lore.kernel.org/r/20251007090232.30386-1-colyli%40fnnas.com
patch subject: [PATCH] bcache: avoid redundant access RB tree in read_dirty
config: sparc-randconfig-002-20251010 (https://download.01.org/0day-ci/archive/20251011/202510110055.zirjAQZU-lkp@intel.com/config)
compiler: sparc64-linux-gcc (GCC) 8.5.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251011/202510110055.zirjAQZU-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510110055.zirjAQZU-lkp@intel.com/

All errors (new ones prefixed by >>):

   drivers/md/bcache/writeback.c: In function 'read_dirty':
>> drivers/md/bcache/writeback.c:591:1: error: label at end of compound statement
    err:
    ^~~


vim +591 drivers/md/bcache/writeback.c

cafe563591446c Kent Overstreet   2013-03-23  474  
5e6926daac267d Kent Overstreet   2013-07-24  475  static void read_dirty(struct cached_dev *dc)
cafe563591446c Kent Overstreet   2013-03-23  476  {
6f10f7d1b02b1b Coly Li           2018-08-11  477  	unsigned int delay = 0;
448f60e858d871 Coly Li           2025-10-07  478  	struct keybuf_key *keys[MAX_WRITEBACKS_IN_PASS], *w;
448f60e858d871 Coly Li           2025-10-07  479  	struct keybuf_key **dump_keys;
539d39eb270834 Tang Junhui       2018-01-08  480  	size_t size;
448f60e858d871 Coly Li           2025-10-07  481  	int checked, dump_nr;
539d39eb270834 Tang Junhui       2018-01-08  482  	int nk, i;
cafe563591446c Kent Overstreet   2013-03-23  483  	struct dirty_io *io;
5e6926daac267d Kent Overstreet   2013-07-24  484  	struct closure cl;
6e6ccc67b9c7a6 Michael Lyle      2018-01-08  485  	uint16_t sequence = 0;
5e6926daac267d Kent Overstreet   2013-07-24  486  
6e6ccc67b9c7a6 Michael Lyle      2018-01-08  487  	BUG_ON(!llist_empty(&dc->writeback_ordering_wait.list));
6e6ccc67b9c7a6 Michael Lyle      2018-01-08  488  	atomic_set(&dc->writeback_sequence_next, sequence);
5e6926daac267d Kent Overstreet   2013-07-24  489  	closure_init_stack(&cl);
cafe563591446c Kent Overstreet   2013-03-23  490  
cafe563591446c Kent Overstreet   2013-03-23  491  	/*
cafe563591446c Kent Overstreet   2013-03-23  492  	 * XXX: if we error, background writeback just spins. Should use some
cafe563591446c Kent Overstreet   2013-03-23  493  	 * mempools.
cafe563591446c Kent Overstreet   2013-03-23  494  	 */
448f60e858d871 Coly Li           2025-10-07  495  	dump_nr = bch_keybuf_dump(&dc->writeback_keys,
448f60e858d871 Coly Li           2025-10-07  496  			dc->writeback_keys.dump_keys,
448f60e858d871 Coly Li           2025-10-07  497  			ARRAY_SIZE(dc->writeback_keys.dump_keys));
448f60e858d871 Coly Li           2025-10-07  498  	dump_keys = dc->writeback_keys.dump_keys;
448f60e858d871 Coly Li           2025-10-07  499  	atomic_set(&dc->writeback_keys.handled, 0);
448f60e858d871 Coly Li           2025-10-07  500  	checked = 0;
5e6926daac267d Kent Overstreet   2013-07-24  501  
771f393e8ffc9b Coly Li           2018-03-18  502  	while (!kthread_should_stop() &&
771f393e8ffc9b Coly Li           2018-03-18  503  	       !test_bit(CACHE_SET_IO_DISABLE, &dc->disk.c->flags) &&
448f60e858d871 Coly Li           2025-10-07  504  	       (checked < dump_nr)) {
539d39eb270834 Tang Junhui       2018-01-08  505  		size = 0;
539d39eb270834 Tang Junhui       2018-01-08  506  		nk = 0;
539d39eb270834 Tang Junhui       2018-01-08  507  
539d39eb270834 Tang Junhui       2018-01-08  508  		do {
448f60e858d871 Coly Li           2025-10-07  509  			w = dump_keys[checked];
448f60e858d871 Coly Li           2025-10-07  510  			BUG_ON(ptr_stale(dc->disk.c, &w->key, 0));
539d39eb270834 Tang Junhui       2018-01-08  511  
539d39eb270834 Tang Junhui       2018-01-08  512  			/*
539d39eb270834 Tang Junhui       2018-01-08  513  			 * Don't combine too many operations, even if they
539d39eb270834 Tang Junhui       2018-01-08  514  			 * are all small.
539d39eb270834 Tang Junhui       2018-01-08  515  			 */
539d39eb270834 Tang Junhui       2018-01-08  516  			if (nk >= MAX_WRITEBACKS_IN_PASS)
cafe563591446c Kent Overstreet   2013-03-23  517  				break;
cafe563591446c Kent Overstreet   2013-03-23  518  
539d39eb270834 Tang Junhui       2018-01-08  519  			/*
539d39eb270834 Tang Junhui       2018-01-08  520  			 * If the current operation is very large, don't
539d39eb270834 Tang Junhui       2018-01-08  521  			 * further combine operations.
539d39eb270834 Tang Junhui       2018-01-08  522  			 */
539d39eb270834 Tang Junhui       2018-01-08  523  			if (size >= MAX_WRITESIZE_IN_PASS)
539d39eb270834 Tang Junhui       2018-01-08  524  				break;
cafe563591446c Kent Overstreet   2013-03-23  525  
539d39eb270834 Tang Junhui       2018-01-08  526  			/*
539d39eb270834 Tang Junhui       2018-01-08  527  			 * Operations are only eligible to be combined
539d39eb270834 Tang Junhui       2018-01-08  528  			 * if they are contiguous.
539d39eb270834 Tang Junhui       2018-01-08  529  			 *
539d39eb270834 Tang Junhui       2018-01-08  530  			 * TODO: add a heuristic willing to fire a
539d39eb270834 Tang Junhui       2018-01-08  531  			 * certain amount of non-contiguous IO per pass,
539d39eb270834 Tang Junhui       2018-01-08  532  			 * so that we can benefit from backing device
539d39eb270834 Tang Junhui       2018-01-08  533  			 * command queueing.
539d39eb270834 Tang Junhui       2018-01-08  534  			 */
539d39eb270834 Tang Junhui       2018-01-08  535  			if ((nk != 0) && bkey_cmp(&keys[nk-1]->key,
448f60e858d871 Coly Li           2025-10-07  536  						&START_KEY(&w->key)))
539d39eb270834 Tang Junhui       2018-01-08  537  				break;
cafe563591446c Kent Overstreet   2013-03-23  538  
448f60e858d871 Coly Li           2025-10-07  539  			size += KEY_SIZE(&w->key);
448f60e858d871 Coly Li           2025-10-07  540  			keys[nk++] = w;
448f60e858d871 Coly Li           2025-10-07  541  		} while (++checked < dump_nr);
cafe563591446c Kent Overstreet   2013-03-23  542  
539d39eb270834 Tang Junhui       2018-01-08  543  		/* Now we have gathered a set of 1..5 keys to write back. */
539d39eb270834 Tang Junhui       2018-01-08  544  		for (i = 0; i < nk; i++) {
539d39eb270834 Tang Junhui       2018-01-08  545  			w = keys[i];
539d39eb270834 Tang Junhui       2018-01-08  546  
d86eaa0f3c56da Christoph Hellwig 2025-09-08  547  			io = kzalloc(sizeof(*io) + sizeof(struct bio_vec) *
d86eaa0f3c56da Christoph Hellwig 2025-09-08  548  				DIV_ROUND_UP(KEY_SIZE(&w->key), PAGE_SECTORS),
cafe563591446c Kent Overstreet   2013-03-23  549  				GFP_KERNEL);
cafe563591446c Kent Overstreet   2013-03-23  550  			if (!io)
cafe563591446c Kent Overstreet   2013-03-23  551  				goto err;
cafe563591446c Kent Overstreet   2013-03-23  552  
cafe563591446c Kent Overstreet   2013-03-23  553  			w->private	= io;
cafe563591446c Kent Overstreet   2013-03-23  554  			io->dc		= dc;
6e6ccc67b9c7a6 Michael Lyle      2018-01-08  555  			io->sequence    = sequence++;
cafe563591446c Kent Overstreet   2013-03-23  556  
cafe563591446c Kent Overstreet   2013-03-23  557  			dirty_init(w);
c34b7ac6508755 Christoph Hellwig 2022-12-06  558  			io->bio.bi_opf = REQ_OP_READ;
4f024f3797c43c Kent Overstreet   2013-10-11  559  			io->bio.bi_iter.bi_sector = PTR_OFFSET(&w->key, 0);
11e9560e6c005b Christoph Hellwig 2021-04-11  560  			bio_set_dev(&io->bio, dc->disk.c->cache->bdev);
cafe563591446c Kent Overstreet   2013-03-23  561  			io->bio.bi_end_io	= read_dirty_endio;
cafe563591446c Kent Overstreet   2013-03-23  562  
25d8be77e19224 Ming Lei          2017-12-18  563  			if (bch_bio_alloc_pages(&io->bio, GFP_KERNEL))
cafe563591446c Kent Overstreet   2013-03-23  564  				goto err_free;
cafe563591446c Kent Overstreet   2013-03-23  565  
c37511b863f36c Kent Overstreet   2013-04-26  566  			trace_bcache_writeback(&w->key);
cafe563591446c Kent Overstreet   2013-03-23  567  
c2a4f3183a1248 Kent Overstreet   2013-09-23  568  			down(&dc->in_flight);
539d39eb270834 Tang Junhui       2018-01-08  569  
3be11dbab67a3e Coly Li           2018-08-11  570  			/*
3be11dbab67a3e Coly Li           2018-08-11  571  			 * We've acquired a semaphore for the maximum
539d39eb270834 Tang Junhui       2018-01-08  572  			 * simultaneous number of writebacks; from here
539d39eb270834 Tang Junhui       2018-01-08  573  			 * everything happens asynchronously.
539d39eb270834 Tang Junhui       2018-01-08  574  			 */
5e6926daac267d Kent Overstreet   2013-07-24  575  			closure_call(&io->cl, read_dirty_submit, NULL, &cl);
539d39eb270834 Tang Junhui       2018-01-08  576  		}
cafe563591446c Kent Overstreet   2013-03-23  577  
539d39eb270834 Tang Junhui       2018-01-08  578  		delay = writeback_delay(dc, size);
539d39eb270834 Tang Junhui       2018-01-08  579  
771f393e8ffc9b Coly Li           2018-03-18  580  		while (!kthread_should_stop() &&
771f393e8ffc9b Coly Li           2018-03-18  581  		       !test_bit(CACHE_SET_IO_DISABLE, &dc->disk.c->flags) &&
771f393e8ffc9b Coly Li           2018-03-18  582  		       delay) {
539d39eb270834 Tang Junhui       2018-01-08  583  			schedule_timeout_interruptible(delay);
539d39eb270834 Tang Junhui       2018-01-08  584  			delay = writeback_delay(dc, 0);
539d39eb270834 Tang Junhui       2018-01-08  585  		}
cafe563591446c Kent Overstreet   2013-03-23  586  	}
cafe563591446c Kent Overstreet   2013-03-23  587  
cafe563591446c Kent Overstreet   2013-03-23  588  	if (0) {
cafe563591446c Kent Overstreet   2013-03-23  589  err_free:
cafe563591446c Kent Overstreet   2013-03-23  590  		kfree(w->private);
cafe563591446c Kent Overstreet   2013-03-23 @591  err:
cafe563591446c Kent Overstreet   2013-03-23  592  	}
cafe563591446c Kent Overstreet   2013-03-23  593  
c2a4f3183a1248 Kent Overstreet   2013-09-23  594  	/*
c2a4f3183a1248 Kent Overstreet   2013-09-23  595  	 * Wait for outstanding writeback IOs to finish (and keybuf slots to be
c2a4f3183a1248 Kent Overstreet   2013-09-23  596  	 * freed) before refilling again
c2a4f3183a1248 Kent Overstreet   2013-09-23  597  	 */
5e6926daac267d Kent Overstreet   2013-07-24  598  	closure_sync(&cl);
448f60e858d871 Coly Li           2025-10-07  599  
448f60e858d871 Coly Li           2025-10-07  600  	if (atomic_read(&dc->writeback_keys.handled) == dump_nr) {
448f60e858d871 Coly Li           2025-10-07  601  		spin_lock(&dc->writeback_keys.lock);
448f60e858d871 Coly Li           2025-10-07  602  		dc->writeback_keys.keys = RB_ROOT;
448f60e858d871 Coly Li           2025-10-07  603  		array_allocator_init(&dc->writeback_keys.freelist);
448f60e858d871 Coly Li           2025-10-07  604  		spin_unlock(&dc->writeback_keys.lock);
448f60e858d871 Coly Li           2025-10-07  605  	} else {
448f60e858d871 Coly Li           2025-10-07  606  		for (i = 0; i < dump_nr; i++) {
448f60e858d871 Coly Li           2025-10-07  607  			w = dump_keys[i];
448f60e858d871 Coly Li           2025-10-07  608  			if (!w->private)
448f60e858d871 Coly Li           2025-10-07  609  				continue;
448f60e858d871 Coly Li           2025-10-07  610  			bch_keybuf_del(&dc->writeback_keys, w);
448f60e858d871 Coly Li           2025-10-07  611  		}
448f60e858d871 Coly Li           2025-10-07  612  	}
448f60e858d871 Coly Li           2025-10-07  613  	atomic_set(&dc->writeback_keys.handled, 0);
5e6926daac267d Kent Overstreet   2013-07-24  614  }
5e6926daac267d Kent Overstreet   2013-07-24  615  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] bcache: avoid redundant access RB tree in read_dirty
  2025-10-07  9:02 [PATCH] bcache: avoid redundant access RB tree in read_dirty colyli
  2025-10-10 16:13 ` kernel test robot
  2025-10-10 16:56 ` kernel test robot
@ 2025-10-20 16:44 ` Pierre Juhen
       [not found] ` <050fe436-e629-4428-8e4d-33edd8985767@orange.fr>
  3 siblings, 0 replies; 7+ messages in thread
From: Pierre Juhen @ 2025-10-20 16:44 UTC (permalink / raw)
  To: colyli, linux-bcache; +Cc: Zhou Jifeng

Hi

I am on kernel 6.16.12.

I have had errors with bcache recently, And I lost my fronted 3 or 4 times :

oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 128: 
bad csum, 32768 bytes, offset 0
oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 64: 
bad csum, 22928 bytes, offset 0
oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 32: 
bad csum, 4848 bytes, offset 2
oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 48: 
bad csum, 14096 bytes, offset 0
oct. 20 15:37:40 pierre.juhen (udev-worker)[461]: nvme0n1p3: Process 
'bcache-register /dev/nvme0n1p3' failed with exit code 1.
oct. 20 15:37:40 pierre.juhen kernel: bcache: prio_read() bad csum 
reading priorities
oct. 20 15:37:40 pierre.juhen kernel: bcache: bch_cache_set_error() 
error on 448f191c-28df-4396-bc44-14d1f77c9005: IO error reading 
priorities, disabling caching
oct. 20 15:37:40 pierre.juhen kernel: bcache: register_bcache() error : 
failed to register device

I had to reconfigure everything after a disk problem.

I have been running bcache for years now, without any problems.

The only difference might be that I configured the frontend with the 
discard option.

The logical volume using bcache have also a discard option in fstab.

The frontend is on a Samsung 980 nvme disk.

Any hint ?

Thank you

Regards


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] bcache: avoid redundant access RB tree in read_dirty
       [not found] ` <050fe436-e629-4428-8e4d-33edd8985767@orange.fr>
@ 2025-10-21  1:18   ` Coly Li
  2025-10-21  2:58     ` Discard option Pierre Juhen
  0 siblings, 1 reply; 7+ messages in thread
From: Coly Li @ 2025-10-21  1:18 UTC (permalink / raw)
  To: Pierre Juhen; +Cc: linux-bcache

> 2025年10月21日 00:39,Pierre Juhen <pierre.juhen@orange.fr> 写道:
> 
> Hi 
> I am on kernel 6.16.12.
> I have had errors with bcache recently, And I lost my fronted 3 or 4 times :
> oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 128: bad csum, 32768 bytes, offset 0 
> oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 64: bad csum, 22928 bytes, offset 0 
> oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 32: bad csum, 4848 bytes, offset 2 
> oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 48: bad csum, 14096 bytes, offset 0 
> oct. 20 15:37:40 pierre.juhen (udev-worker)[461]: nvme0n1p3: Process 'bcache-register /dev/nvme0n1p3' failed with exit code 1. 
> oct. 20 15:37:40 pierre.juhen kernel: bcache: prio_read() bad csum reading priorities 
> oct. 20 15:37:40 pierre.juhen kernel: bcache: bch_cache_set_error() error on 448f191c-28df-4396-bc44-14d1f77c9005: IO error reading priorities, disabling caching 
> oct. 20 15:37:40 pierre.juhen kernel: bcache: register_bcache() error : failed to register device
> 

I assume this email is irrelevant to the patch “bcache: avoid redundant access RB tree in read_dirty”, am I correct?


> I had to reconfigure everything after a disk problem.
> I have been running bcache for years now, without any problems.
> The only difference might be that I configured the frontend with the discard option.

The discard option is not recommended. Indeed in next merge window I will submit a patch series to drop the discard option.


> The logical volume using bcache have also a discard option in fstab.
> The frontend is on a Samsung 980 nvme disk. 

Try not to enable discard on cache device. This option will disappear soon.

I don’t know whether discard option of Samsung 980 nvme disk may change the content of discarded LBA or not, from NVMe spec, it could be zero-filled or undefined.
Anyway in current code discard doesn’t help performance, I suggest to not enable discard and see whether the issue still shows up.

My suggestion is: always use default configuration, all our test case and performance optimization are for default configurations.

Thanks.

Coly Li

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Discard option
  2025-10-21  1:18   ` Coly Li
@ 2025-10-21  2:58     ` Pierre Juhen
  2025-10-21  4:10       ` Coly Li
  0 siblings, 1 reply; 7+ messages in thread
From: Pierre Juhen @ 2025-10-21  2:58 UTC (permalink / raw)
  To: Coly Li; +Cc: linux-bcache

Hi Coli,

I assume this email is irrelevant to the patch “bcache: avoid redundant access RB tree in read_dirty”, am I correct?

Yes you are;  sorry, I picked a message and forgot to change the title

The discard option is not recommended. Indeed in next merge window I will submit a patch series to drop the discard option.

OK, I will boot on a Live USB Key to wipe the caching device and create 
a new one without discard option.

But should I keep the discard mount option in fstab for the logical 
volumes inside bcache ?

Thanks,

Regards

_______________

Le 21/10/2025 à 03:18, Coly Li a écrit :
>> 2025年10月21日 00:39,Pierre Juhen <pierre.juhen@orange.fr> 写道:
>>
>> Hi
>> I am on kernel 6.16.12.
>> I have had errors with bcache recently, And I lost my fronted 3 or 4 times :
>> oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 128: bad csum, 32768 bytes, offset 0
>> oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 64: bad csum, 22928 bytes, offset 0
>> oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 32: bad csum, 4848 bytes, offset 2
>> oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 48: bad csum, 14096 bytes, offset 0
>> oct. 20 15:37:40 pierre.juhen (udev-worker)[461]: nvme0n1p3: Process 'bcache-register /dev/nvme0n1p3' failed with exit code 1.
>> oct. 20 15:37:40 pierre.juhen kernel: bcache: prio_read() bad csum reading priorities
>> oct. 20 15:37:40 pierre.juhen kernel: bcache: bch_cache_set_error() error on 448f191c-28df-4396-bc44-14d1f77c9005: IO error reading priorities, disabling caching
>> oct. 20 15:37:40 pierre.juhen kernel: bcache: register_bcache() error : failed to register device
>>
> I assume this email is irrelevant to the patch “bcache: avoid redundant access RB tree in read_dirty”, am I correct?
>
>
>> I had to reconfigure everything after a disk problem.
>> I have been running bcache for years now, without any problems.
>> The only difference might be that I configured the frontend with the discard option.
> The discard option is not recommended. Indeed in next merge window I will submit a patch series to drop the discard option.
>
>
>> The logical volume using bcache have also a discard option in fstab.
>> The frontend is on a Samsung 980 nvme disk.
> Try not to enable discard on cache device. This option will disappear soon.
>
> I don’t know whether discard option of Samsung 980 nvme disk may change the content of discarded LBA or not, from NVMe spec, it could be zero-filled or undefined.
> Anyway in current code discard doesn’t help performance, I suggest to not enable discard and see whether the issue still shows up.
>
> My suggestion is: always use default configuration, all our test case and performance optimization are for default configurations.
>
> Thanks.
>
> Coly Li
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Discard option
  2025-10-21  2:58     ` Discard option Pierre Juhen
@ 2025-10-21  4:10       ` Coly Li
  0 siblings, 0 replies; 7+ messages in thread
From: Coly Li @ 2025-10-21  4:10 UTC (permalink / raw)
  To: Pierre Juhen; +Cc: linux-bcache

On Tue, Oct 21, 2025 at 04:58:21AM +0800, Pierre Juhen wrote:
> Hi Coli,
> 
> I assume this email is irrelevant to the patch “bcache: avoid redundant access RB tree in read_dirty”, am I correct?
> 
> Yes you are;  sorry, I picked a message and forgot to change the title
> 
> The discard option is not recommended. Indeed in next merge window I will submit a patch series to drop the discard option.
> 
> OK, I will boot on a Live USB Key to wipe the caching device and create a
> new one without discard option.
> 
> But should I keep the discard mount option in fstab for the logical volumes
> inside bcache ?
> 

Just not enable the bcache internal discard is fine. Bcache handles outside
discard bios properly, if this is a file system mounted on top of a bcache
device, it is fine to keep discard option of the file system.


> 
> _______________
> 
> Le 21/10/2025 à 03:18, Coly Li a écrit :
> > > 2025年10月21日 00:39,Pierre Juhen <pierre.juhen@orange.fr> 写道:
> > > 
> > > Hi
> > > I am on kernel 6.16.12.
> > > I have had errors with bcache recently, And I lost my fronted 3 or 4 times :
> > > oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 128: bad csum, 32768 bytes, offset 0
> > > oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 64: bad csum, 22928 bytes, offset 0
> > > oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 32: bad csum, 4848 bytes, offset 2
> > > oct. 20 15:37:40 pierre.juhen kernel: bcache: journal_read_bucket() 48: bad csum, 14096 bytes, offset 0
> > > oct. 20 15:37:40 pierre.juhen (udev-worker)[461]: nvme0n1p3: Process 'bcache-register /dev/nvme0n1p3' failed with exit code 1.
> > > oct. 20 15:37:40 pierre.juhen kernel: bcache: prio_read() bad csum reading priorities
> > > oct. 20 15:37:40 pierre.juhen kernel: bcache: bch_cache_set_error() error on 448f191c-28df-4396-bc44-14d1f77c9005: IO error reading priorities, disabling caching
> > > oct. 20 15:37:40 pierre.juhen kernel: bcache: register_bcache() error : failed to register device
> > > 
> > I assume this email is irrelevant to the patch “bcache: avoid redundant access RB tree in read_dirty”, am I correct?
> > 
> > 
> > > I had to reconfigure everything after a disk problem.
> > > I have been running bcache for years now, without any problems.
> > > The only difference might be that I configured the frontend with the discard option.
> > The discard option is not recommended. Indeed in next merge window I will submit a patch series to drop the discard option.
> > 
> > 
> > > The logical volume using bcache have also a discard option in fstab.
> > > The frontend is on a Samsung 980 nvme disk.
> > Try not to enable discard on cache device. This option will disappear soon.
> > 
> > I don’t know whether discard option of Samsung 980 nvme disk may change the content of discarded LBA or not, from NVMe spec, it could be zero-filled or undefined.
> > Anyway in current code discard doesn’t help performance, I suggest to not enable discard and see whether the issue still shows up.
> > 
> > My suggestion is: always use default configuration, all our test case and performance optimization are for default configurations.
> > 
> > Thanks.
> > 
> > Coly Li
> >

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2025-10-21  4:11 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-07  9:02 [PATCH] bcache: avoid redundant access RB tree in read_dirty colyli
2025-10-10 16:13 ` kernel test robot
2025-10-10 16:56 ` kernel test robot
2025-10-20 16:44 ` Pierre Juhen
     [not found] ` <050fe436-e629-4428-8e4d-33edd8985767@orange.fr>
2025-10-21  1:18   ` Coly Li
2025-10-21  2:58     ` Discard option Pierre Juhen
2025-10-21  4:10       ` Coly Li

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox