public inbox for linux-bcache@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1 00/10] cache device failure handling improvement
@ 2018-01-03 14:03 Coly Li
  2018-01-03 14:03 ` [PATCH v1 01/10] bcache: exit bch_writeback_thread() with proper task state Coly Li
                   ` (10 more replies)
  0 siblings, 11 replies; 34+ messages in thread
From: Coly Li @ 2018-01-03 14:03 UTC (permalink / raw)
  To: linux-bcache; +Cc: linux-block, mlyle, tang.junhui, Coly Li

Hi maintainers and folks,

This patch set tries to improve cache device failure handling. A basic
idea to handle failed cache device is,
- Unregister cache set
- Detach all backing devices attached to this cache set
- Stop all bcache devices linked to this cache set
The above process is named 'cache set retire' by me. The result of cache
set retire is, cache set and bcache devices are all removed, following
I/O requests will get failed immediately to notift upper layer or user
space coce that the cache device is failed or disconnected.

The first 8 patches of this patch set is to fix existing bugs in bcache,
the last 2 patches do the real improvement. Order of applying these patches
is important, if the last 2 patches are applied firstly, kernel panic or
process hang will be observed. Therefore I suggest to apply the first 8
fixes, then apply the last 2 patches.

The patch set is tested with writethrough, writeback, writearound mode,
read/write/readwrite workloads, so far it works as expected. IMHO the
cache set retire logic is complicated, I need your help to review the
patches, any question is warmly wlecome.

Coly Li (10):
  bcache: exit bch_writeback_thread() with proper task state
  bcache: set task properly in allocator_wait()
  bcache: reduce cache_set devices iteration by devices_max_used
  bcache: fix cached_dev->count usage for bch_cache_set_error()
  bcache: stop dc->writeback_rate_update if cache set is stopping
  bcache: stop dc->writeback_rate_update, dc->writeback_thread earlier
  bcache: set error_limit correctly
  bcache: fix misleading error message in bch_count_io_errors()
  bcache: add io_disable to struct cache_set
  bcache: stop all attached bcache devices for a retired cache set

 drivers/md/bcache/alloc.c     |  5 ++---
 drivers/md/bcache/bcache.h    | 19 +++++++++++++++-
 drivers/md/bcache/btree.c     |  8 ++++---
 drivers/md/bcache/io.c        | 15 ++++++++-----
 drivers/md/bcache/journal.c   |  4 ++--
 drivers/md/bcache/request.c   | 26 ++++++++++++++++------
 drivers/md/bcache/super.c     | 51 +++++++++++++++++++++++++++++++++++--------
 drivers/md/bcache/sysfs.c     |  8 +++++--
 drivers/md/bcache/util.h      |  6 -----
 drivers/md/bcache/writeback.c | 51 +++++++++++++++++++++++++++++++++----------
 drivers/md/bcache/writeback.h |  4 +---
 11 files changed, 144 insertions(+), 53 deletions(-)

Thanks in advance.

Coly Li

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2018-01-08 16:01 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-01-03 14:03 [PATCH v1 00/10] cache device failure handling improvement Coly Li
2018-01-03 14:03 ` [PATCH v1 01/10] bcache: exit bch_writeback_thread() with proper task state Coly Li
2018-01-03 17:08   ` Michael Lyle
2018-01-05 17:05     ` Coly Li
2018-01-05 17:09       ` Michael Lyle
2018-01-08  7:09   ` Hannes Reinecke
2018-01-08 13:50     ` Coly Li
2018-01-03 14:03 ` [PATCH v1 02/10] bcache: set task properly in allocator_wait() Coly Li
2018-01-03 17:09   ` Michael Lyle
2018-01-05 17:11     ` Coly Li
2018-01-08  7:10   ` Hannes Reinecke
2018-01-03 14:03 ` [PATCH v1 03/10] bcache: reduce cache_set devices iteration by devices_max_used Coly Li
2018-01-03 17:11   ` Michael Lyle
2018-01-08  7:12   ` Hannes Reinecke
2018-01-03 14:03 ` [PATCH v1 04/10] bcache: fix cached_dev->count usage for bch_cache_set_error() Coly Li
2018-01-08  7:16   ` Hannes Reinecke
2018-01-03 14:03 ` [PATCH v1 05/10] bcache: stop dc->writeback_rate_update if cache set is stopping Coly Li
2018-01-08  7:22   ` Hannes Reinecke
2018-01-08 16:01     ` Coly Li
2018-01-03 14:03 ` [PATCH v1 06/10] bcache: stop dc->writeback_rate_update, dc->writeback_thread earlier Coly Li
2018-01-08  7:25   ` Hannes Reinecke
2018-01-03 14:03 ` [PATCH v1 07/10] bcache: set error_limit correctly Coly Li
2018-01-08  7:26   ` Hannes Reinecke
2018-01-03 14:03 ` [PATCH v1 08/10] bcache: fix misleading error message in bch_count_io_errors() Coly Li
2018-01-03 17:14   ` Michael Lyle
2018-01-08  7:27   ` Hannes Reinecke
2018-01-03 14:03 ` [PATCH v1 09/10] bcache: add io_disable to struct cache_set Coly Li
2018-01-08  7:30   ` Hannes Reinecke
2018-01-03 14:03 ` [PATCH v1 10/10] bcache: stop all attached bcache devices for a retired cache set Coly Li
2018-01-08  7:31   ` Hannes Reinecke
2018-01-03 17:07 ` [PATCH v1 00/10] cache device failure handling improvement Michael Lyle
2018-01-04  2:20   ` Coly Li
2018-01-04 17:46     ` Michael Lyle
2018-01-05  4:04       ` Coly Li

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox