From: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
To: "idryomov@gmail.com" <idryomov@gmail.com>,
"ionut.nechita@windriver.com" <ionut.nechita@windriver.com>
Cc: "ionut_n2001@yahoo.com" <ionut_n2001@yahoo.com>,
"sage@newdream.net" <sage@newdream.net>,
Xiubo Li <xiubli@redhat.com>,
"linux-rt-devel@lists.linux.dev" <linux-rt-devel@lists.linux.dev>,
"jkosina@suse.com" <jkosina@suse.com>,
"ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>,
"rostedt@goodmis.org" <rostedt@goodmis.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"slava@dubeyko.com" <slava@dubeyko.com>,
Alex Markuze <amarkuze@redhat.com>,
"jlayton@kernel.org" <jlayton@kernel.org>,
"bigeasy@linutronix.de" <bigeasy@linutronix.de>,
"clrkwllms@kernel.org" <clrkwllms@kernel.org>,
"superm1@kernel.org" <superm1@kernel.org>
Subject: RE: [PATCH] ceph: add timeout protection to ceph_mdsc_sync() path
Date: Tue, 17 Feb 2026 21:52:00 +0000
Message-ID: <fe346c26386a4fc85cfa27e669dffd13191e7ea4.camel@ibm.com>
In-Reply-To: <20260213075111.32886-2-ionut.nechita@windriver.com>
On Fri, 2026-02-13 at 09:51 +0200, Ionut Nechita (Wind River) wrote:
> I also created a tracker issue for this on the Ceph bug tracker:
>
> https://tracker.ceph.com/issues/74897
>
It looks like I was able to reproduce the symptoms of the issue by running the
generic/013 xfstests test case in a loop:
#!/bin/bash
# Keep re-running generic/013 until the hung task shows up.
while true; do
    sudo ./check generic/013
done
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.346895] INFO: task fsstress:14466 blocked for more than 122 seconds.
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.347995] Not tainted 6.19.0-rc8+ #10
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.348530] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349426] task:fsstress state:D stack:0 pid:14466 tgid:14466 ppid:14464 task_flags:0x400140 flags:0x00080800
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349438] Call Trace:
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349441] <TASK>
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349445] __schedule+0xe8a/0x57f0
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349457] ? kasan_save_stack+0x39/0x60
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349466] ? kasan_save_stack+0x26/0x60
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349471] ? kasan_save_track+0x14/0x40
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349475] ? kasan_save_free_info+0x3b/0x60
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349485] ? __kasan_slab_free+0x7a/0xb0
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349489] ? ceph_mdsc_release_request+0x6a3/0x880
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349497] ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349502] ? __kasan_check_write+0x14/0x30
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349507] ? __pv_queued_spin_lock_slowpath+0xb04/0xf80
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349514] ? __pfx___schedule+0x10/0x10
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349520] ? __kasan_check_read+0x11/0x20
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349525] ? __call_rcu_common+0x386/0x14b0
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349532] schedule+0x75/0x2f0
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349538] schedule_timeout+0x16d/0x210
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349542] ? __pfx_schedule_timeout+0x10/0x10
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349548] ? __kasan_check_write+0x14/0x30
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349553] ? _raw_spin_lock_irq+0x8b/0x100
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349559] ? __pfx__raw_spin_lock_irq+0x10/0x10
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349565] ? kasan_save_track+0x14/0x40
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349569] wait_for_completion+0x14a/0x340
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349573] ? __pfx_wait_for_completion+0x10/0x10
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349577] ? __kasan_check_write+0x14/0x30
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349582] ? __pfx_mutex_unlock+0x10/0x10
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349587] ceph_mdsc_sync+0x4b4/0xe80
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349593] ? __pfx_ceph_mdsc_sync+0x10/0x10
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349597] ? ceph_osdc_put_request+0x38/0x770
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349603] ? ceph_osdc_sync+0x1cb/0x350
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349608] ceph_sync_fs+0xa0/0x4c0
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349612] sync_filesystem+0x182/0x240
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349618] __x64_sys_syncfs+0xac/0x160
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349623] x64_sys_call+0x746/0x2360
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349629] do_syscall_64+0x82/0x5d0
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349635] ? __x64_sys_openat+0x108/0x240
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349641] ? __kasan_check_read+0x11/0x20
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349647] ? fpregs_assert_state_consistent+0x5c/0x100
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349655] ? __pfx___x64_sys_openat+0x10/0x10
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349661] ? __kasan_check_write+0x14/0x30
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349667] ? ksys_write+0x1a3/0x230
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349672] ? __kasan_check_read+0x11/0x20
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349677] ? fpregs_assert_state_consistent+0x5c/0x100
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349682] ? do_syscall_64+0xbf/0x5d0
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349687] ? fpregs_assert_state_consistent+0x5c/0x100
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349692] ? __kasan_check_read+0x11/0x20
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349705] ? fpregs_assert_state_consistent+0x5c/0x100
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349709] ? do_syscall_64+0xbf/0x5d0
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349715] ? __kasan_check_read+0x11/0x20
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349720] ? fpregs_assert_state_consistent+0x5c/0x100
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349724] ? irqentry_exit+0xa5/0x600
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349730] ? exc_page_fault+0x95/0x100
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349736] entry_SYSCALL_64_after_hwframe+0x76/0x7e
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349740] RIP: 0033:0x792fb1d1ba4b
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349745] RSP: 002b:00007ffc3844eb58 EFLAGS: 00000246 ORIG_RAX: 0000000000000132
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349752] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000792fb1d1ba4b
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349756] RDX: 0000000000000000 RSI: 000059045610b440 RDI: 0000000000000004
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349759] RBP: 0000000000000004 R08: 0000000000000026 R09: 00007ffc3844e986
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349762] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000149
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349765] R13: 00007ffc3844eba0 R14: 000059042de9d0b3 R15: 0000000000000149
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349771] </TASK>
l *ceph_mdsc_sync+0x4b4
0xffffffff82cddbe4 is in ceph_mdsc_sync (fs/ceph/mds_client.c:5916).
5911                    }
5912                    doutc(cl, "wait on %llu (want %llu)\n",
5913                          req->r_tid, want_tid);
5914                    wait_for_completion(&req->r_safe_completion);
5915
5916                    mutex_lock(&mdsc->mutex);
5917                    ceph_mdsc_put_request(req);
5918                    if (!nextreq)
5919                            break; /* next dne before, so we're done! */
5920                    if (RB_EMPTY_NODE(&nextreq->r_node)) {
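The return address at +0x4b4 resolves to line 5916, i.e. right after the
wait_for_completion() at line 5914, so the task is blocked waiting on
req->r_safe_completion. Purely as an illustration (this is not the patch under
review; the killable variant, the 60 second value, and the error handling are
my own assumptions), a bounded wait at that spot would look roughly like this:

        /*
         * Rough sketch only: bound the unsafe-request wait instead of
         * blocking forever.  The 60s timeout is a placeholder, not a
         * proposed default.
         */
        long left;

        doutc(cl, "wait on %llu (want %llu)\n",
              req->r_tid, want_tid);
        left = wait_for_completion_killable_timeout(&req->r_safe_completion,
                                                    msecs_to_jiffies(60 * 1000));
        if (left == 0)
                pr_warn("ceph: timed out waiting on unsafe tid %llu\n",
                        req->r_tid);
        else if (left < 0)
                pr_warn("ceph: interrupted while waiting on unsafe tid %llu\n",
                        req->r_tid);

        mutex_lock(&mdsc->mutex);
        ceph_mdsc_put_request(req);

How to unwind the request and session references cleanly after a timeout is the
hard part, of course; the sketch only shows where a bounded wait would slot in.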
I am not sure yet that the reason is the same.
Thanks,
Slava.
Thread overview: 8+ messages
2026-02-08 13:18 [PATCH] ceph: add timeout protection to ceph_mdsc_sync() path Ionut Nechita (Wind River)
2026-02-09 23:03 ` Viacheslav Dubeyko
2026-02-11 7:21 ` Sebastian Andrzej Siewior
2026-02-13 7:51 ` Ionut Nechita (Wind River)
2026-02-17 21:52 ` Viacheslav Dubeyko [this message]
2026-02-18 19:57 ` Ionut Nechita (Wind River)
2026-02-18 20:04 ` Viacheslav Dubeyko
2026-02-19 9:37 ` Alex Markuze