From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev, Mingzhe Zou <mingzhe.zou@easystack.cn>,
Coly Li <colyli@suse.de>, Jens Axboe <axboe@kernel.dk>
Subject: [PATCH 5.15 58/69] bcache: fixup lock c->root error
Date: Thu, 30 Nov 2023 16:22:55 +0000 [thread overview]
Message-ID: <20231130162134.962040542@linuxfoundation.org> (raw)
In-Reply-To: <20231130162133.035359406@linuxfoundation.org>
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mingzhe Zou <mingzhe.zou@easystack.cn>
commit e34820f984512b433ee1fc291417e60c47d56727 upstream.
We had a problem with io hung because it was waiting for c->root to
release the lock.
crash> cache_set.root -l cache_set.list ffffa03fde4c0050
root = 0xffff802ef454c800
crash> btree -o 0xffff802ef454c800 | grep rw_semaphore
[ffff802ef454c858] struct rw_semaphore lock;
crash> struct rw_semaphore ffff802ef454c858
struct rw_semaphore {
count = {
counter = -4294967297
},
wait_list = {
next = 0xffff00006786fc28,
prev = 0xffff00005d0efac8
},
wait_lock = {
raw_lock = {
{
val = {
counter = 0
},
{
locked = 0 '\000',
pending = 0 '\000'
},
{
locked_pending = 0,
tail = 0
}
}
}
},
osq = {
tail = {
counter = 0
}
},
owner = 0xffffa03fdc586603
}
The "counter = -4294967297" means that lock count is -1 and a write lock
is being attempted. Then, we found that there is a btree with a counter
of 1 in btree_cache_freeable.
crash> cache_set -l cache_set.list ffffa03fde4c0050 -o|grep btree_cache
[ffffa03fde4c1140] struct list_head btree_cache;
[ffffa03fde4c1150] struct list_head btree_cache_freeable;
[ffffa03fde4c1160] struct list_head btree_cache_freed;
[ffffa03fde4c1170] unsigned int btree_cache_used;
[ffffa03fde4c1178] wait_queue_head_t btree_cache_wait;
[ffffa03fde4c1190] struct task_struct *btree_cache_alloc_lock;
crash> list -H ffffa03fde4c1140|wc -l
973
crash> list -H ffffa03fde4c1150|wc -l
1123
crash> cache_set.btree_cache_used -l cache_set.list ffffa03fde4c0050
btree_cache_used = 2097
crash> list -s btree -l btree.list -H ffffa03fde4c1140|grep -E -A2 "^ lock = {" > btree_cache.txt
crash> list -s btree -l btree.list -H ffffa03fde4c1150|grep -E -A2 "^ lock = {" > btree_cache_freeable.txt
[root@node-3 127.0.0.1-2023-08-04-16:40:28]# pwd
/var/crash/127.0.0.1-2023-08-04-16:40:28
[root@node-3 127.0.0.1-2023-08-04-16:40:28]# cat btree_cache.txt|grep counter|grep -v "counter = 0"
[root@node-3 127.0.0.1-2023-08-04-16:40:28]# cat btree_cache_freeable.txt|grep counter|grep -v "counter = 0"
counter = 1
We found that this is a bug in bch_sectors_dirty_init() when locking c->root:
(1). Thread X has locked c->root(A) write.
(2). Thread Y failed to lock c->root(A), waiting for the lock(c->root A).
(3). Thread X bch_btree_set_root() changes c->root from A to B.
(4). Thread X releases the lock(c->root A).
(5). Thread Y successfully locks c->root(A).
(6). Thread Y releases the lock(c->root B).
down_write locked ---(1)----------------------┐
| |
| down_read waiting ---(2)----┐ |
| | ┌-------------┐ ┌-------------┐
bch_btree_set_root ===(3)========>> | c->root A | | c->root B |
| | └-------------┘ └-------------┘
up_write ---(4)---------------------┘ | |
| | |
down_read locked ---(5)-----------┘ |
| |
up_read ---(6)-----------------------------┘
Since c->root may change, the correct steps to lock c->root should be
the same as bch_root_usage(), compare after locking.
static unsigned int bch_root_usage(struct cache_set *c)
{
unsigned int bytes = 0;
struct bkey *k;
struct btree *b;
struct btree_iter iter;
goto lock_root;
do {
rw_unlock(false, b);
lock_root:
b = c->root;
rw_lock(false, b, b->level);
} while (b != c->root);
for_each_key_filter(&b->keys, k, &iter, bch_ptr_bad)
bytes += bkey_bytes(k);
rw_unlock(false, b);
return (bytes * 100) / btree_bytes(c);
}
Fixes: b144e45fc576 ("bcache: make bch_sectors_dirty_init() to be multithreaded")
Signed-off-by: Mingzhe Zou <mingzhe.zou@easystack.cn>
Cc: <stable@vger.kernel.org>
Signed-off-by: Coly Li <colyli@suse.de>
Link: https://lore.kernel.org/r/20231120052503.6122-7-colyli@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/md/bcache/writeback.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -967,14 +967,22 @@ static int bch_btre_dirty_init_thread_nr
void bch_sectors_dirty_init(struct bcache_device *d)
{
int i;
+ struct btree *b = NULL;
struct bkey *k = NULL;
struct btree_iter iter;
struct sectors_dirty_init op;
struct cache_set *c = d->c;
struct bch_dirty_init_state state;
+retry_lock:
+ b = c->root;
+ rw_lock(0, b, b->level);
+ if (b != c->root) {
+ rw_unlock(0, b);
+ goto retry_lock;
+ }
+
/* Just count root keys if no leaf node */
- rw_lock(0, c->root, c->root->level);
if (c->root->level == 0) {
bch_btree_op_init(&op.op, -1);
op.inode = d->id;
@@ -987,7 +995,7 @@ void bch_sectors_dirty_init(struct bcach
sectors_dirty_init_fn(&op.op, c->root, k);
}
- rw_unlock(0, c->root);
+ rw_unlock(0, b);
return;
}
@@ -1024,7 +1032,7 @@ void bch_sectors_dirty_init(struct bcach
out:
/* Must wait for all threads to stop. */
wait_event(state.wait, atomic_read(&state.started) == 0);
- rw_unlock(0, c->root);
+ rw_unlock(0, b);
}
void bch_cached_dev_writeback_init(struct cached_dev *dc)
next prev parent reply other threads:[~2023-11-30 16:33 UTC|newest]
Thread overview: 84+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-30 16:21 [PATCH 5.15 00/69] 5.15.141-rc1 review Greg Kroah-Hartman
2023-11-30 16:21 ` [PATCH 5.15 01/69] afs: Fix afs_server_list to be cleaned up with RCU Greg Kroah-Hartman
2023-11-30 16:21 ` [PATCH 5.15 02/69] afs: Make error on cell lookup failure consistent with OpenAFS Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 03/69] drm/panel: boe-tv101wum-nl6: Fine tune the panel power sequence Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 04/69] drm/panel: auo,b101uan08.3: " Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 05/69] drm/panel: simple: Fix Innolux G101ICE-L01 bus flags Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 06/69] drm/panel: simple: Fix Innolux G101ICE-L01 timings Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 07/69] wireguard: use DEV_STATS_INC() Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 08/69] octeontx2-pf: Fix memory leak during interface down Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 09/69] ata: pata_isapnp: Add missing error check for devm_ioport_map() Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 10/69] drm/rockchip: vop: Fix color for RGB888/BGR888 format on VOP full Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 11/69] HID: core: store the unique system identifier in hid_device Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 12/69] HID: fix HID device resource race between HID core and debugging support Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 13/69] ipv4: Correct/silence an endian warning in __ip_do_redirect Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 14/69] net: usb: ax88179_178a: fix failed operations during ax88179_reset Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 15/69] net/smc: avoid data corruption caused by decline Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 16/69] arm/xen: fix xen_vcpu_info allocation alignment Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 17/69] octeontx2-pf: Fix ntuple rule creation to direct packet to VF with higher Rx queue than its PF Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 18/69] amd-xgbe: handle corner-case during sfp hotplug Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 19/69] amd-xgbe: handle the corner-case during tx completion Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 20/69] amd-xgbe: propagate the correct speed and duplex status Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 21/69] net: axienet: Fix check for partial TX checksum Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 22/69] afs: Return ENOENT if no cell DNS record can be found Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 23/69] afs: Fix file locking on R/O volumes to operate in local mode Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 24/69] nvmet: nul-terminate the NQNs passed in the connect command Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 25/69] USB: dwc3: qcom: fix resource leaks on probe deferral Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 26/69] USB: dwc3: qcom: fix ACPI platform device leak Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 27/69] lockdep: Fix block chain corruption Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 28/69] MIPS: KVM: Fix a build warning about variable set but not used Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 29/69] media: camss: Replace hard coded value with parameter Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 30/69] media: camss: sm8250: Virtual channels for CSID Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 31/69] media: qcom: camss: Fix set CSI2_RX_CFG1_VC_MODE when VC is greater than 3 Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 32/69] media: qcom: camss: Fix csid-gen2 for test pattern generator Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 33/69] ext4: add a new helper to check if es must be kept Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 34/69] ext4: factor out __es_alloc_extent() and __es_free_extent() Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 35/69] ext4: use pre-allocated es in __es_insert_extent() Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 36/69] ext4: use pre-allocated es in __es_remove_extent() Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 37/69] ext4: using nofail preallocation in ext4_es_remove_extent() Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 38/69] ext4: using nofail preallocation in ext4_es_insert_delayed_block() Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 39/69] ext4: using nofail preallocation in ext4_es_insert_extent() Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 40/69] ext4: fix slab-use-after-free " Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 41/69] ext4: make sure allocate pending entry not fail Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 42/69] tracing/kprobes: Return EADDRNOTAVAIL when func matches several symbols Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 43/69] proc: sysctl: prevent aliased sysctls from getting passed to init Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 44/69] ACPI: resource: Skip IRQ override on ASUS ExpertBook B1402CVA Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 45/69] swiotlb-xen: provide the "max_mapping_size" method Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 46/69] bcache: replace a mistaken IS_ERR() by IS_ERR_OR_NULL() in btree_gc_coalesce() Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 47/69] md: fix bi_status reporting in md_end_clone_io Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 48/69] bcache: fixup multi-threaded bch_sectors_dirty_init() wake-up race Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 49/69] io_uring/fs: consider link->flags when getting path for LINKAT Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 50/69] s390/dasd: protect device queue against concurrent access Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 51/69] USB: serial: option: add Luat Air72*U series products Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 52/69] hv_netvsc: Fix race of register_netdevice_notifier and VF register Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 53/69] hv_netvsc: Mark VF as slave before exposing it to user-mode Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 54/69] dm-delay: fix a race between delay_presuspend and delay_bio Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 55/69] bcache: check return value from btree_node_alloc_replacement() Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 56/69] bcache: prevent potential division by zero error Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 57/69] bcache: fixup init dirty data errors Greg Kroah-Hartman
2023-11-30 16:22 ` Greg Kroah-Hartman [this message]
2023-11-30 16:22 ` [PATCH 5.15 59/69] usb: cdnsp: Fix deadlock issue during using NCM gadget Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 60/69] USB: serial: option: add Fibocom L7xx modules Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 61/69] USB: serial: option: fix FM101R-GL defines Greg Kroah-Hartman
2023-11-30 16:22 ` [PATCH 5.15 62/69] USB: serial: option: dont claim interface 4 for ZTE MF290 Greg Kroah-Hartman
2023-11-30 16:23 ` [PATCH 5.15 63/69] usb: typec: tcpm: Skip hard reset when in error recovery Greg Kroah-Hartman
2023-11-30 16:23 ` [PATCH 5.15 64/69] USB: dwc2: write HCINT with INTMASK applied Greg Kroah-Hartman
2023-11-30 16:23 ` [PATCH 5.15 65/69] usb: dwc3: Fix default mode initialization Greg Kroah-Hartman
2023-11-30 16:23 ` [PATCH 5.15 66/69] usb: dwc3: set the dma max_seg_size Greg Kroah-Hartman
2023-11-30 16:23 ` [PATCH 5.15 67/69] USB: dwc3: qcom: fix software node leak on probe errors Greg Kroah-Hartman
2023-11-30 16:23 ` [PATCH 5.15 68/69] USB: dwc3: qcom: fix wakeup after probe deferral Greg Kroah-Hartman
2023-11-30 16:23 ` [PATCH 5.15 69/69] io_uring: fix off-by one bvec index Greg Kroah-Hartman
2023-11-30 17:21 ` [PATCH 5.15 00/69] 5.15.141-rc1 review Daniel Díaz
2023-11-30 17:44 ` Guenter Roeck
2023-11-30 18:11 ` Daniel Díaz
2023-11-30 18:56 ` Guenter Roeck
2023-12-01 8:21 ` Greg Kroah-Hartman
2023-12-01 9:35 ` Francis Laniel
2023-12-01 9:44 ` Greg Kroah-Hartman
2023-12-01 14:34 ` Daniel Díaz
2023-12-01 23:04 ` Greg Kroah-Hartman
2023-12-04 14:55 ` Daniel Díaz
2023-12-01 6:31 ` Harshit Mogalapalli
2023-11-30 22:27 ` Pavel Machek
2023-11-30 18:57 ` Florian Fainelli
2023-12-01 0:08 ` Shuah Khan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20231130162134.962040542@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=axboe@kernel.dk \
--cc=colyli@suse.de \
--cc=mingzhe.zou@easystack.cn \
--cc=patches@lists.linux.dev \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.