From: bugzilla-daemon@bugzilla.kernel.org
To: linux-xfs@vger.kernel.org
Subject: [Bug 214767] xfs seems to hang due to race condition? maybe related to (gratuitous) thaw.
Date: Wed, 20 Oct 2021 16:31:07 +0000
Message-ID: <bug-214767-201763-GZw49KOx95@https.bugzilla.kernel.org/>
In-Reply-To: <bug-214767-201763@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=214767

--- Comment #3 from Christian Theune (ct@flyingcircus.io) ---
I have another machine that managed to break free after about 20 minutes. It
performed around 3 thaws during that time:

Oct 20 18:05:22 pixometerstag04 kernel: INFO: task nix-daemon:1387736 blocked for more than 1228 seconds.
Oct 20 18:05:22 pixometerstag04 kernel:       Not tainted 5.10.70 #1-NixOS
Oct 20 18:05:22 pixometerstag04 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 20 18:05:22 pixometerstag04 kernel: task:nix-daemon      state:D stack:    0 pid:1387736 ppid:  1356 flags:0x00000000
Oct 20 18:05:22 pixometerstag04 kernel: Call Trace:
Oct 20 18:05:22 pixometerstag04 kernel:  __schedule+0x271/0x860
Oct 20 18:05:22 pixometerstag04 kernel:  schedule+0x46/0xb0
Oct 20 18:05:22 pixometerstag04 kernel:  xfs_log_commit_cil+0x6a4/0x800 [xfs]
Oct 20 18:05:22 pixometerstag04 kernel:  ? wake_up_q+0xa0/0xa0
Oct 20 18:05:22 pixometerstag04 kernel:  __xfs_trans_commit+0x9d/0x310 [xfs]
Oct 20 18:05:22 pixometerstag04 kernel:  xfs_create+0x472/0x560 [xfs]
Oct 20 18:05:22 pixometerstag04 kernel:  xfs_generic_create+0x247/0x320 [xfs]
Oct 20 18:05:22 pixometerstag04 kernel:  ? xfs_lookup+0x55/0x100 [xfs]
Oct 20 18:05:22 pixometerstag04 kernel:  path_openat+0xdd7/0x1070
Oct 20 18:05:22 pixometerstag04 kernel:  do_filp_open+0x88/0x130
Oct 20 18:05:22 pixometerstag04 kernel:  ? getname_flags.part.0+0x29/0x1a0
Oct 20 18:05:22 pixometerstag04 kernel:  do_sys_openat2+0x97/0x150
Oct 20 18:05:22 pixometerstag04 kernel:  __x64_sys_openat+0x54/0x90
Oct 20 18:05:22 pixometerstag04 kernel:  do_syscall_64+0x33/0x40
Oct 20 18:05:22 pixometerstag04 kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Oct 20 18:05:22 pixometerstag04 kernel: RIP: 0033:0x7f3e3114dea8
Oct 20 18:05:22 pixometerstag04 kernel: RSP: 002b:00007ffd92f7df10 EFLAGS: 00000293 ORIG_RAX: 0000000000000101
Oct 20 18:05:22 pixometerstag04 kernel: RAX: ffffffffffffffda RBX: 00000000000800c1 RCX: 00007f3e3114dea8
Oct 20 18:05:22 pixometerstag04 kernel: RDX: 00000000000800c1 RSI: 000000000195cb50 RDI: 00000000ffffff9c
Oct 20 18:05:22 pixometerstag04 kernel: RBP: 000000000195cb50 R08: 0000000000000000 R09: 0000000000000003
Oct 20 18:05:22 pixometerstag04 kernel: R10: 00000000000001b6 R11: 0000000000000293 R12: 00007ffd92f7e300
Oct 20 18:05:22 pixometerstag04 kernel: R13: 00007ffd92f7dfc0 R14: 00007ffd92f7dfb0 R15: 00007ffd92f7fd30

However, the first thaw came after the hang had already started, between the
122s and 245s hung-task reports, and the last thaw came after the machine broke
free.

Interestingly, the machine also logged a variety of operations that succeeded
while the hangs were being reported, so maybe the hang only affected the /tmp
filesystem and not /. Apparently PostgreSQL was still writing to disk in that
period, so I'd guess this only happened on /tmp.

This entry falls squarely in the middle of the reported hung-task messages:

Oct 20 17:52:00 pixometerstag04 postgres[1008]: user=,db= LOG:  checkpoint
complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0
recycled; write=0.004 s, sync=0.001 s, total=0.018 s; sync fi>
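
To make the timing argument above easier to check, here is a minimal Python
sketch that derives the blocked window from a hung-task report and tests
whether another log entry falls inside it. The helper names (blocked_window,
falls_inside) are hypothetical, the year 2021 is taken from the mail date
because the syslog timestamps do not carry one, and the two sample lines are
shortened copies of the log lines quoted above. The hung-task message only says
"more than N seconds", so the computed start is the latest possible start of
the hang, not an exact one.

```python
import re
from datetime import datetime, timedelta

# Shortened copies of the two log lines quoted in this comment.
HUNG_LINE = ("Oct 20 18:05:22 pixometerstag04 kernel: INFO: task "
             "nix-daemon:1387736 blocked for more than 1228 seconds.")
CHECKPOINT_LINE = ("Oct 20 17:52:00 pixometerstag04 postgres[1008]: "
                   "user=,db= LOG:  checkpoint complete")

STAMP = r"(\w{3} +\d+ \d{2}:\d{2}:\d{2})"
HUNG_RE = re.compile(rf"^{STAMP} .*blocked for more than (\d+) seconds")
STAMP_RE = re.compile(rf"^{STAMP} ")


def parse_stamp(stamp, year=2021):
    """Parse a syslog timestamp; the year is an assumption (not in the log).

    Note: %b depends on the locale; this assumes English month names.
    """
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def blocked_window(line, year=2021):
    """Return (latest possible start, report time) for a hung-task line."""
    m = HUNG_RE.match(line)
    if m is None:
        raise ValueError("not a hung-task report")
    end = parse_stamp(m.group(1), year)
    return end - timedelta(seconds=int(m.group(2))), end


def falls_inside(line, window, year=2021):
    """True if the log line's timestamp lies within the given window."""
    m = STAMP_RE.match(line)
    if m is None:
        raise ValueError("no syslog timestamp found")
    start, end = window
    return start <= parse_stamp(m.group(1), year) <= end


if __name__ == "__main__":
    window = blocked_window(HUNG_LINE)
    print("blocked since at least:", window[0])
    print("checkpoint inside window:", falls_inside(CHECKPOINT_LINE, window))
```

With these inputs the task had been blocked since 17:44:54 at the latest, so
the 17:52:00 checkpoint lies inside the window in which nix-daemon was stuck.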
