From: Andrew Morton <akpm@linux-foundation.org>
To: mm-commits@vger.kernel.org,piaojun@huawei.com,mark@fasheh.com,junxiao.bi@oracle.com,joseph.qi@linux.alibaba.com,jlbec@evilplan.org,heming.zhao@suse.com,gechangwei@live.cn,zzzccc427@gmail.com,akpm@linux-foundation.org
Subject: + ocfs2-cluster-keep-heartbeat-local-node-stable.patch added to mm-nonmm-unstable branch
Date: Tue, 23 Jun 2026 12:01:08 -0700
Message-ID: <20260623190109.496431F000E9@smtp.kernel.org>
The patch titled
Subject: ocfs2/cluster: keep heartbeat local node stable
has been added to the -mm mm-nonmm-unstable branch. Its filename is
ocfs2-cluster-keep-heartbeat-local-node-stable.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/ocfs2-cluster-keep-heartbeat-local-node-stable.patch
This patch will later appear in the mm-nonmm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via various
branches at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there most days
------------------------------------------------------
From: Cen Zhang <zzzccc427@gmail.com>
Subject: ocfs2/cluster: keep heartbeat local node stable
Date: Tue, 16 Jun 2026 15:49:31 +0800
o2nm_node_local_store() handles local=0 by stopping o2net and setting
cl_local_node to O2NM_INVALID_NODE_NUM, but it leaves cl_has_local set.
That stale state makes o2nm_this_node() return 255, blocks a later local=1
attempt with -EBUSY, and can feed 255 to heartbeat users that call
o2nm_this_node() dynamically.
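
The stale-state sequence above can be replayed in a small user-space model. This is only a sketch under stated assumptions: struct model_cluster and the model_*() helpers are hypothetical stand-ins, not the kernel code, and the clear_has_local argument exists only to switch between the old and the patched local=0 handling.

```c
#include <assert.h>
#include <errno.h>

#define O2NM_MAX_NODES        255
#define O2NM_INVALID_NODE_NUM 255

/* Hypothetical stand-in for the two cluster fields discussed above. */
struct model_cluster {
	unsigned int cl_has_local;
	unsigned char cl_local_node;
};

/* Modeled on o2nm_this_node(): trust cl_local_node whenever cl_has_local is set. */
static unsigned char model_this_node(const struct model_cluster *c)
{
	return c->cl_has_local ? c->cl_local_node : O2NM_MAX_NODES;
}

/*
 * Simplified model of writing "val" to a node's "local" attribute.
 * clear_has_local selects the old (0) or patched (1) local=0 handling.
 */
static int model_local_store(struct model_cluster *c, unsigned char nd_num,
			     int val, int clear_has_local)
{
	assert(nd_num < O2NM_MAX_NODES);

	/* A different node is already recorded as the local one. */
	if (val && c->cl_has_local && c->cl_local_node != nd_num)
		return -EBUSY;

	if (!val && c->cl_has_local && c->cl_local_node == nd_num) {
		if (clear_has_local)
			c->cl_has_local = 0;
		c->cl_local_node = O2NM_INVALID_NODE_NUM;
	}

	if (val) {
		c->cl_has_local = 1;
		c->cl_local_node = nd_num;
	}
	return 0;
}

/* Replay local=1, local=0, local=1 on node 3; return the last result. */
static int model_replay_toggle(int clear_has_local)
{
	struct model_cluster c = { 0, O2NM_INVALID_NODE_NUM };

	if (model_local_store(&c, 3, 1, clear_has_local))
		return -1;
	if (model_local_store(&c, 3, 0, clear_has_local))
		return -1;
	/* Either way the reset leaves no valid local node number. */
	assert(model_this_node(&c) == O2NM_INVALID_NODE_NUM);
	return model_local_store(&c, 3, 1, clear_has_local);
}
```

In this model, model_replay_toggle(0) ends in -EBUSY because the stale cl_has_local makes node 3 look like a different node than the recorded 255, while model_replay_toggle(1) lets the second local=1 attempt succeed.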
Clearing cl_has_local is required when the local node is reset. But
heartbeat threads can still be running at that point. They pin the local
node config item at startup, yet o2hb_do_disk_heartbeat() and thread
teardown re-read o2nm_this_node() for the local slot and for
o2nm_undepend_this_node(). Once local=0 has cleared the live local-node
state, those dynamic reads return O2NM_MAX_NODES, which is also the
invalid node number 255.
Store the local node number in the heartbeat region when the region
starts. Use that stable node for heartbeat slot writes/checks,
negotiation messages, and the final configfs undepend. Stop the heartbeat
loop when the current local node no longer matches the stored node, and
clear cl_has_local together with cl_local_node in the local=0 path so
nodemanager state matches node removal.
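
A rough user-space sketch of the snapshot approach described above, assuming a much simplified region: struct model_region, its counter array, and the model_*() helpers are hypothetical and only mirror the shape of the change, not the real heartbeat I/O path.

```c
#include <errno.h>
#include <stddef.h>

#define O2NM_MAX_NODES 255

/* Hypothetical, heavily simplified stand-in for struct o2hb_region. */
struct model_region {
	unsigned char hr_node_num;              /* snapshot taken at region start */
	unsigned long hr_slots[O2NM_MAX_NODES]; /* one write counter per node slot */
};

/* Record the local node once; refuse to start without a valid one. */
static int model_region_start(struct model_region *reg, unsigned char this_node)
{
	reg->hr_node_num = this_node;
	if (reg->hr_node_num == O2NM_MAX_NODES)
		return -EINVAL;
	return 0;
}

/* Write one heartbeat using the stored node, never a fresh lookup. */
static int model_heartbeat_once(struct model_region *reg)
{
	unsigned char slot = reg->hr_node_num;

	/* Bounds check before indexing, as the patched helpers do. */
	if (slot >= O2NM_MAX_NODES)
		return -EINVAL;
	reg->hr_slots[slot]++;
	return 0;
}

/*
 * Run the loop against a trace of values that a live "this node" lookup
 * would return on each iteration. Stop as soon as the live value no
 * longer matches the snapshot. Returns the number of heartbeats written.
 */
static size_t model_run(struct model_region *reg,
			const unsigned char *live_node, size_t n)
{
	size_t beats = 0;
	size_t i;

	for (i = 0; i < n && live_node[i] == reg->hr_node_num; i++) {
		if (model_heartbeat_once(reg))
			break;
		beats++;
	}
	return beats;
}
```

With a trace in which the live lookup turns into 255 partway through, the loop exits on the mismatch and every write that did happen landed in the slot chosen at start, so the invalid number is never used as an index.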
Validation reproduced this kernel report:
KASAN slab-out-of-bounds in o2hb_do_disk_heartbeat+0x372/0xb30
RIP: 0010:memset+0xf/0x20
Read of size 8
Call trace:
dump_stack_lvl+0x66/0xa0
print_report+0xd0/0x630
o2hb_do_disk_heartbeat+0x372/0xb30 (fs/ocfs2/cluster/heartbeat.c:1079)
srso_alias_return_thunk+0x5/0xfbef5
__virt_addr_valid+0x188/0x2f0
kasan_report+0xe4/0x120
o2hb_do_disk_heartbeat+0x5/0xb30 (fs/ocfs2/cluster/heartbeat.c:1079)
o2hb_thread+0x14e/0x770
kthread_affine_node+0x139/0x180
lockdep_hardirqs_on_prepare+0xda/0x190
trace_hardirqs_on+0x18/0x130
kthread+0x19d/0x1e0
ret_from_fork+0x37a/0x4d0
__switch_to+0x2d5/0x6f0
ret_from_fork_asm+0x1a/0x30
Link: https://lore.kernel.org/20260616074931.3774929-1-zzzccc427@gmail.com
Fixes: a7f6a5fb4bde ("[PATCH] OCFS2: The Second Oracle Cluster Filesystem")
Assisted-by: Codex:gpt-5.5
Signed-off-by: Cen Zhang <zzzccc427@gmail.com>
Suggested-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Heming Zhao <heming.zhao@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
fs/ocfs2/cluster/heartbeat.c | 43 ++++++++++++++++++++-----------
fs/ocfs2/cluster/nodemanager.c | 19 ++++++++++---
fs/ocfs2/cluster/nodemanager.h | 2 +
3 files changed, 46 insertions(+), 18 deletions(-)
--- a/fs/ocfs2/cluster/heartbeat.c~ocfs2-cluster-keep-heartbeat-local-node-stable
+++ a/fs/ocfs2/cluster/heartbeat.c
@@ -203,6 +203,7 @@ struct o2hb_region {
/* protected by the hr_callback_sem */
struct task_struct *hr_task;
+ u8 hr_node_num;
unsigned int hr_blocks;
unsigned long long hr_start_block;
@@ -350,12 +351,12 @@ static void o2hb_disarm_timeout(struct o
cancel_delayed_work_sync(&reg->hr_nego_timeout_work);
}
-static int o2hb_send_nego_msg(int key, int type, u8 target)
+static int o2hb_send_nego_msg(int key, int type, u8 target, u8 node_num)
{
struct o2hb_nego_msg msg;
int status, ret;
- msg.node_num = o2nm_this_node();
+ msg.node_num = node_num;
again:
ret = o2net_send_message(type, key, &msg, sizeof(msg),
target, &status);
@@ -373,8 +374,10 @@ static void o2hb_nego_timeout(struct wor
unsigned long live_node_bitmap[BITS_TO_LONGS(O2NM_MAX_NODES)];
int master_node, i, ret;
struct o2hb_region *reg;
+ u8 node_num;
reg = container_of(work, struct o2hb_region, hr_nego_timeout_work.work);
+ node_num = reg->hr_node_num;
/* don't negotiate timeout if last hb failed since it is very
* possible io failed. Should let write timeout fence self.
*/
@@ -385,10 +388,10 @@ static void o2hb_nego_timeout(struct wor
/* lowest node as master node to make negotiate decision. */
master_node = find_first_bit(live_node_bitmap, O2NM_MAX_NODES);
- if (master_node == o2nm_this_node()) {
+ if (master_node == node_num) {
if (!test_bit(master_node, reg->hr_nego_node_bitmap)) {
printk(KERN_NOTICE "o2hb: node %d hb write hung for %ds on region %s (%pg).\n",
- o2nm_this_node(), O2HB_NEGO_TIMEOUT_MS/1000,
+ node_num, O2HB_NEGO_TIMEOUT_MS / 1000,
config_item_name(&reg->hr_item), reg_bdev(reg));
set_bit(master_node, reg->hr_nego_node_bitmap);
}
@@ -417,7 +420,7 @@ static void o2hb_nego_timeout(struct wor
mlog(ML_HEARTBEAT, "send NEGO_APPROVE msg to node %d\n", i);
ret = o2hb_send_nego_msg(reg->hr_key,
- O2HB_NEGO_APPROVE_MSG, i);
+ O2HB_NEGO_APPROVE_MSG, i, node_num);
if (ret)
mlog(ML_ERROR, "send NEGO_APPROVE msg to node %d fail %d\n",
i, ret);
@@ -425,10 +428,10 @@ static void o2hb_nego_timeout(struct wor
} else {
/* negotiate timeout with master node. */
printk(KERN_NOTICE "o2hb: node %d hb write hung for %ds on region %s (%pg), negotiate timeout with node %d.\n",
- o2nm_this_node(), O2HB_NEGO_TIMEOUT_MS/1000, config_item_name(&reg->hr_item),
+ node_num, O2HB_NEGO_TIMEOUT_MS / 1000, config_item_name(&reg->hr_item),
reg_bdev(reg), master_node);
ret = o2hb_send_nego_msg(reg->hr_key, O2HB_NEGO_TIMEOUT_MSG,
- master_node);
+ master_node, node_num);
if (ret)
mlog(ML_ERROR, "send NEGO_TIMEOUT msg to node %d fail %d\n",
master_node, ret);
@@ -601,7 +604,9 @@ static int o2hb_issue_node_write(struct
o2hb_bio_wait_init(write_wc);
- slot = o2nm_this_node();
+ slot = reg->hr_node_num;
+ if (slot >= O2NM_MAX_NODES)
+ return -EINVAL;
bio = o2hb_setup_one_bio(reg, write_wc, &slot, slot+1,
REQ_OP_WRITE | REQ_SYNC);
@@ -670,8 +675,12 @@ static int o2hb_check_own_slot(struct o2
struct o2hb_disk_slot *slot;
struct o2hb_disk_heartbeat_block *hb_block;
char *errstr;
+ u8 node_num = reg->hr_node_num;
+
+ if (node_num >= O2NM_MAX_NODES)
+ return 0;
- slot = &reg->hr_slots[o2nm_this_node()];
+ slot = &reg->hr_slots[node_num];
/* Don't check on our 1st timestamp */
if (!slot->ds_last_time)
return 0;
@@ -712,7 +721,10 @@ static inline void o2hb_prepare_block(st
struct o2hb_disk_slot *slot;
struct o2hb_disk_heartbeat_block *hb_block;
- node_num = o2nm_this_node();
+ node_num = reg->hr_node_num;
+ if (node_num >= O2NM_MAX_NODES)
+ return;
+
slot = &reg->hr_slots[node_num];
hb_block = (struct o2hb_disk_heartbeat_block *)slot->ds_raw_block;
@@ -1206,7 +1218,7 @@ static int o2hb_thread(void *data)
set_user_nice(current, MIN_NICE);
/* Pin node */
- ret = o2nm_depend_this_node();
+ ret = o2nm_depend_node(reg->hr_node_num);
if (ret) {
mlog(ML_ERROR, "Node has been deleted, ret = %d\n", ret);
reg->hr_node_deleted = 1;
@@ -1215,7 +1227,8 @@ static int o2hb_thread(void *data)
}
while (!kthread_should_stop() &&
- !reg->hr_unclean_stop && !reg->hr_aborted_start) {
+ !reg->hr_unclean_stop && !reg->hr_aborted_start &&
+ o2nm_this_node() == reg->hr_node_num) {
/* We track the time spent inside
* o2hb_do_disk_heartbeat so that we avoid more than
* hr_timeout_ms between disk writes. On busy systems
@@ -1264,7 +1277,7 @@ static int o2hb_thread(void *data)
}
/* Unpin node */
- o2nm_undepend_this_node();
+ o2nm_undepend_node(reg->hr_node_num);
mlog(ML_HEARTBEAT|ML_KTHREAD, "o2hb thread exiting\n");
@@ -1791,7 +1804,8 @@ static ssize_t o2hb_region_dev_store(str
/* We can't heartbeat without having had our node number
* configured yet. */
- if (o2nm_this_node() == O2NM_MAX_NODES)
+ reg->hr_node_num = o2nm_this_node();
+ if (reg->hr_node_num == O2NM_MAX_NODES)
return -EINVAL;
ret = kstrtol(p, 0, &fd);
@@ -2036,6 +2050,7 @@ static struct config_item *o2hb_heartbea
ret = -ENAMETOOLONG;
goto free;
}
+ reg->hr_node_num = O2NM_MAX_NODES;
spin_lock(&o2hb_live_lock);
reg->hr_region_num = 0;
--- a/fs/ocfs2/cluster/nodemanager.c~ocfs2-cluster-keep-heartbeat-local-node-stable
+++ a/fs/ocfs2/cluster/nodemanager.c
@@ -367,6 +367,7 @@ static ssize_t o2nm_node_local_store(str
if (!tmp && cluster->cl_has_local &&
cluster->cl_local_node == node->nd_num) {
o2net_stop_listening(node);
+ cluster->cl_has_local = 0;
cluster->cl_local_node = O2NM_INVALID_NODE_NUM;
}
@@ -782,12 +783,12 @@ void o2nm_undepend_item(struct config_it
configfs_undepend_item(item);
}
-int o2nm_depend_this_node(void)
+int o2nm_depend_node(u8 node_num)
{
int ret = 0;
struct o2nm_node *local_node;
- local_node = o2nm_get_node_by_num(o2nm_this_node());
+ local_node = o2nm_get_node_by_num(node_num);
if (!local_node) {
ret = -EINVAL;
goto out;
@@ -800,17 +801,27 @@ out:
return ret;
}
-void o2nm_undepend_this_node(void)
+void o2nm_undepend_node(u8 node_num)
{
struct o2nm_node *local_node;
- local_node = o2nm_get_node_by_num(o2nm_this_node());
+ local_node = o2nm_get_node_by_num(node_num);
BUG_ON(!local_node);
o2nm_undepend_item(&local_node->nd_item);
o2nm_node_put(local_node);
}
+int o2nm_depend_this_node(void)
+{
+ return o2nm_depend_node(o2nm_this_node());
+}
+
+void o2nm_undepend_this_node(void)
+{
+ o2nm_undepend_node(o2nm_this_node());
+}
+
static void __exit exit_o2nm(void)
{
--- a/fs/ocfs2/cluster/nodemanager.h~ocfs2-cluster-keep-heartbeat-local-node-stable
+++ a/fs/ocfs2/cluster/nodemanager.h
@@ -65,6 +65,8 @@ void o2nm_node_put(struct o2nm_node *nod
int o2nm_depend_item(struct config_item *item);
void o2nm_undepend_item(struct config_item *item);
+int o2nm_depend_node(u8 node_num);
+void o2nm_undepend_node(u8 node_num);
int o2nm_depend_this_node(void);
void o2nm_undepend_this_node(void);
_
Patches currently in -mm which might be from zzzccc427@gmail.com are
ocfs2-cluster-keep-heartbeat-local-node-stable.patch